If you’re reading this article, you’re likely building a new product, website, section, or feature and you’ve been tasked with designing the flow and information architecture.
You’re probably already aware that to design a product that your customers will find easy to use, you’ll need to apply the usability heuristic that requires your design to match your users “real world” by emulating concepts, models, and conventions that your users are already familiar with. So, to help you understand the mental model of your target users, you’ll likely want to carry out a card sorting exercise, amongst various other research methodologies.
First off, if you’ve never conducted a card sorting exercise before, you should definitely start by reading Open vs. closed vs. hybrid card sorting for UX research.
Open card sorting is the ideal scenario for a similarity matrix. Participants create their own categories, allowing the matrix to capture the relationships between items based on how users naturally group them. This reveals valuable insights into their mental models.
With hybrid card sorting, participants can either use predefined categories or create their own subcategories within them. The similarity matrix can still be used to analyze how participants grouped items within the existing categories, providing valuable insights for refining the information architecture.
Closed card sorting, on the other hand, uses predefined categories. So, while a similarity matrix can be calculated, it might not be as informative since participants can’t create their own categories. The key limitations of using a similarity matrix in closed card sorting are that:
Once you’re up to speed with the concept of an open card sorting, you can then go ahead to carry out the exercise. In this exercise, you essentially provide customers with a number of cards representing various concepts, ideas, or topics and you allow them to create “themes” or “categories” and group the cards which they consider similar under the same theme without any predefined categories from you.
In designing your product, this kind of exercise will give you fantastic insights into how to organize the information in your website in a way that matches your target customer’s mental model, ensuring that they can easily find what they are looking for.
Here’s what an open card sorting exercise might look like to participants:
A similarity matrix is a tool used to quantify how similar things are to each other. In product design and development, it helps us analyze data by identifying patterns and relationships between different elements. In the context of this article, it will help you analyze the relationships between items or concepts based on participants’ responses during a card sorting exercise.
During your open card sorting session, you’ll ask participants to organize a set of cards (representing items or concepts) into groups or categories based on their own mental models or understanding. After participants have completed the sorting task, you’ll collect their responses and analyze it to understand the relationships and similarities between the items.
A similarity matrix in this context is a table or matrix where each row and column represents an item or concept, and the cells of the matrix indicate the frequency or degree to which pairs of items were sorted into the same category by participants. The values in the cells typically represent the number of times two items were grouped together or a measure of similarity, such as a percentage or score.
Analyzing the similarity matrix will allow you to identify patterns and clusters of items that tend to be grouped together by participants. This information will help to inform the organization and structure of the content, navigation, and information architecture in your product, website, or system, ultimately leading to a seamless user experience:
Product Categories | Category A | Category B | Category C | Category D |
Category A |
1.00 |
|||
Category B |
0.80 |
1.00 |
||
Category C |
0.50 |
0.70 |
1.00 |
|
Category D |
0.60 |
0.40 |
0.90 |
1.00 |
Here’s an example of a similarity matrix depicting the similarity between four different product categories, coded A, B, C and D. In this example:
This may seem complicated, but similarity matrixes can be quite useful:
Here’s a breakdown of creating a similarity matrix in Excel, including a template linked here:
Let’s say you have a set of 10 cards in an open card sorting exercise. You then invite participants (let’s call them Participant 1, Participant 2, Participant 3, etc.) to sort these cards into groups based on how they think these categories are related or should be organized. Each participant can create their own groups and label them as they see fit.
During the exercise, you observe and record how each participant groups the cards. You will note what groups or categories were created by the participants and which cards were placed into each of these groups/categories by each participant.
Here’s an example of how you might record this data:
Collect raw data, for example: | |||||
G1 | G2 | G3 | G4 | G5 | |
P1 | C1, C2, C3 | C6, C9 | C7, C8 | C4, C5 | |
P2 | C1, C4 | C2, C3 | C5, C6 | C7, C8, C9, C10 | |
P3 | C3, C4, C8 | C1, C2 | C9, C10 | C5, C6, C7 | |
P4 | C5, C6, C7 | C1, C3 | C2, C4, C8, C9 | C10 | |
P5 | C1, C7 | C2, C4, C8 | C3, C5 | C6, C9, C10 | |
P6 | C2, C3, C4, C5 | C1 | C6, C7, C8, C9 | C10 | |
P7 | C8, C9, C10 | C1, C2, C3, C4 | C5, C6, C7 | ||
P8 | C4, C5, C6 | C1, C2, C3 | C7, C8, C9 | C10 | |
P9 | C2, C5, C7, C8 | C1, C3, C4 | C6, C9, C10 |
In this table:
The current data format shows each participant’s groups and the cards within those groups. To calculate the similarity matrix, we need to transform this data to show which cards each participant grouped together.
Here’s how to achieve this:
Here’s what your data will now look like:
P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | |
C1 | G1 | G5 | G5 | G5 | G2 | G5 | G5 | G1 | G5 | G2 |
C2 | G4 | G5 | G5 | G4 | G2 | G3 | G4 | G5 | G5 | G5 |
C3 | G5 | G5 | G5 | G5 | G2 | G2 | G2 | G2 | G2 | G2 |
C4 | G1 | G1 | G2 | G1 | G2 | G1 | G2 | G1 | G2 | G1 |
C5 | G4 | G4 | G4 | G1 | G2 | G4 | G4 | G4 | G2 | G2 |
C6 | G3 | G3 | G3 | G1 | G3 | G1 | G3 | G3 | G3 | G3 |
C7 | G4 | G4 | G4 | G1 | G4 | G4 | G4 | G4 | G1 | G4 |
C8 | G4 | G4 | G4 | G1 | G4 | G4 | G4 | G4 | G1 | G4 |
C9 | G3 | G3 | G3 | G3 | G3 | G3 | G3 | G3 | G3 | G3 |
C10 | G1 | G1 | G3 | G1 | G1 | G1 | G1 | G3 | G3 | G1 |
First, carry out a frequency analysis by counting how many times each card is placed in a specific group by the participants. This will give you insights into commonalities or differences in how participants perceive the categories.
We will use the COUNTIF function for this:
Here’s what your data will look like now:
G1 | G2 | G3 | G4 | G5 | |
C1 |
2 |
2 |
0 |
0 |
6 |
C2 |
0 |
1 |
1 |
3 |
5 |
C3 |
0 |
6 |
0 |
0 |
4 |
C4 |
6 |
4 |
0 |
0 |
0 |
C5 |
1 |
3 |
0 |
6 |
0 |
C6 |
2 |
0 |
8 |
0 |
0 |
C7 |
2 |
0 |
0 |
8 |
0 |
C8 |
2 |
0 |
0 |
8 |
0 |
C9 |
0 |
0 |
10 |
0 |
0 |
C10 |
7 |
0 |
3 |
0 |
0 |
Create a similarity matrix to see how similar participants’ groupings are. We will be using the following formulas: COUNT, IF, SUM, and ARRAY formulas to calculate the number of shared cards between two participants’ groups.
Here’s what your similarity matrix will look like:
C1 | C2 | C3 | C4 | C5 | C6 | C7 | C8 | C9 | C10 | |
C1 |
1 |
0 |
0.4 |
0.4 |
0.2 |
0.4 |
0.4 |
0.4 |
0.2 |
0.2 |
C2 |
0 |
1 |
0.2 |
0 |
0 |
0 |
0 |
0 |
0.2 |
0 |
C3 |
0.4 |
0.2 |
1 |
0.4 |
0.2 |
0.2 |
0.2 |
0.2 |
0.4 |
0.2 |
C4 |
0.4 |
0 |
0.4 |
1 |
0.4 |
0.4 |
0.4 |
0.4 |
0.4 |
0.4 |
C5 |
0.2 |
0 |
0.2 |
0.4 |
1 |
0.2 |
0.4 |
0.4 |
0.2 |
0.2 |
C6 |
0.4 |
0 |
0.2 |
0.4 |
0.2 |
1 |
0.6 |
0.6 |
0.6 |
0.6 |
C7 |
0.4 |
0 |
0.2 |
0.4 |
0.4 |
0.6 |
1 |
1 |
0.4 |
0.4 |
C8 |
0.4 |
0 |
0.2 |
0.4 |
0.4 |
0.6 |
1 |
1 |
0.4 |
0.4 |
C9 |
0.2 |
0.2 |
0.4 |
0.4 |
0.2 |
0.6 |
0.4 |
0.4 |
1 |
0.6 |
C10 |
0.2 |
0 |
0.2 |
0.4 |
0.2 |
0.6 |
0.4 |
0.4 |
0.6 |
1 |
In this similarity matrix, each cell represents the similarity or association between two cards based on how frequently they were grouped together in the card sorting exercise.
Here’s how you can interpret the data:
You can use this matrix to infer which cards tended to be grouped together frequently (higher values) and which cards were less likely to be grouped together (lower values). For instance, cells C7–C8 and C8–C7 both have a value of 1 in the similarity matrix, which means that cards C7 and C8 were consistently grouped together in the card sorting exercise by the participants.
The value of 1 indicates a perfect similarity, implying that whenever C7 appeared in a group, C8 was also present, and vice versa, across all instances of sorting. This high similarity suggests a strong association or relationship between cards C7 and C8 in the participants’ grouping preferences. This would be a key insight from the card sorting exercise.
I’ve covered the basics, but these tips can help you analyze your similarity matrix and get better insights. Sometimes, making sense of your data is the hardest part!
Consider normalizing your data, especially if the number of times participants sorted cards varied significantly. This can involve dividing each cell value by the total number of times a card appeared in any category. Normalization ensures all cards are compared on a fair basis, regardless of how often they were individually sorted.
Use conditional formatting in your spreadsheet to emphasize high co-occurrence values and highlight strong relationships. For example, cells with a darker shade could indicate cards frequently grouped together, while lighter shades represent weaker relationships.
This allows you to visually identify strong associations at a glance. Densely colored areas will signify frequently grouped cards, while sparse areas pinpoint concepts that need further attention.
Use a dendrogram (tree diagram) alongside your similarity matrix. This can reveal potential hierarchical structures within your data. You could imagine the matrix as a landscape, and the dendrogram as a set of branching trees growing from the base.
Cards with strong relationships cluster together on the branches, forming subcategories or child categories within a broader information architecture. This kind of visualisation can be particularly helpful for complex card sorting exercises where there are many items.
While the manual method and spreadsheets work well for basic analysis of small groups of data, dedicated card sorting software like Maze and UXtweak, can automate many of these steps for simpler analysis of larger datasets. They also offer additional features like:
We’ve shown how a similarity matrix helps you understand the relationships between different cards based on the participants’ sorting behavior. It provides insights into which cards are conceptually related or similar in the participants’ perception.
Hence, if you’re looking for a tool that will provide a clear and efficient way to analyze card sorting data, helping you identify user-driven information structures for your website, app or product, then look no further than a similarity matrix. The template provided will help you carry out your open card sorting exercises with ease.
Header image source: IconScout
LogRocket lets you replay users' product experiences to visualize struggle, see issues affecting adoption, and combine qualitative and quantitative data so you can create amazing digital experiences.
See how design choices, interactions, and issues affect your users — get a demo of LogRocket today.
Nostalgia-driven aesthetics is a real thing. In this blog, I talk all about 90s website designs — from grunge-inspired typography to quirky GIFs and clashing colors — and what you can learn from them.
You’ll need to read this blog through and through to know what’s working and what’s not in your design. In this one, I break down key performance metrics like task error rates and system performance.
Users see a product; designers see layers. The 5 UX design layers — strategy, scope, structure, skeleton, and surface — help build UIs step by step.
This blog’s all about learning to set up, manage, and use design tokens in design system — all to enable scalable, consistent, and efficient collaboration between designers and developers.