Date: 2024-05-10

If you’re reading this article, you’re likely building a new product, website, section, or feature and you’ve been tasked with designing the flow and information architecture.

You’re probably already aware that to design a product that your customers will find easy to use, you’ll need to apply the usability heuristic that requires your design to match your users’ “real world” by emulating concepts, models, and conventions that they are already familiar with. So, to help you understand the mental model of your target users, you’ll likely want to carry out a card sorting exercise, amongst various other research methodologies.

Card sorting: A review for your similarity matrix

First off, if you’ve never conducted a card sorting exercise before, you should definitely start by reading Open vs. closed vs. hybrid card sorting for UX research.

Open card sorting is the ideal scenario for a similarity matrix. Participants create their own categories, allowing the matrix to capture the relationships between items based on how users naturally group them. This reveals valuable insights into their mental models.

With hybrid card sorting, participants can either use predefined categories or create their own subcategories within them. The similarity matrix can still be used to analyze how participants grouped items within the existing categories, providing valuable insights for refining the information architecture.

Closed card sorting, on the other hand, uses predefined categories. So, while a similarity matrix can be calculated, it might not be as informative since participants can’t create their own categories. The key limitations of using a similarity matrix in closed card sorting are that:

It can’t capture user dissatisfaction with existing categories

It doesn’t reveal alternative category structures participants might prefer

Once you’re up to speed with the concept of open card sorting, you can go ahead and carry out the exercise. In this exercise, you provide customers with a number of cards representing various concepts, ideas, or topics. You then let them create their own “themes” or “categories” and group the cards they consider similar under the same theme, without any predefined categories from you.

In designing your product, this kind of exercise will give you fantastic insights into how to organize the information in your website in a way that matches your target customer’s mental model, ensuring that they can easily find what they are looking for.

Here’s what an open card sorting exercise might look like to participants:

What is a similarity matrix?

A similarity matrix is a tool used to quantify how similar things are to each other. In product design and development, it helps us analyze data by identifying patterns and relationships between different elements. In the context of this article, it will help you analyze the relationships between items or concepts based on participants’ responses during a card sorting exercise.

During your open card sorting session, you’ll ask participants to organize a set of cards (representing items or concepts) into groups or categories based on their own mental models or understanding. After participants have completed the sorting task, you’ll collect their responses and analyze them to understand the relationships and similarities between the items.

A similarity matrix in this context is a table or matrix where each row and column represents an item or concept, and the cells of the matrix indicate the frequency or degree to which pairs of items were sorted into the same category by participants. The values in the cells typically represent the number of times two items were grouped together or a measure of similarity, such as a percentage or score.

Analyzing the similarity matrix will allow you to identify patterns and clusters of items that tend to be grouped together by participants. This information will help to inform the organization and structure of the content, navigation, and information architecture in your product, website, or system, ultimately leading to a seamless user experience:

Product Categories   Category A   Category B   Category C   Category D
Category A           1.00
Category B           0.80         1.00
Category C           0.50         0.70         1.00
Category D           0.60         0.40         0.90         1.00

Here’s an example of a similarity matrix depicting the similarity between four different product categories, coded A, B, C and D. In this example:

The rows and columns represent different product categories (Category A, Category B, Category C, Category D)

The numbers in the cells represent the similarity scores between pairs of categories. For instance, the similarity score between Category A and Category B is 0.80, indicating a relatively strong similarity based on the data or criteria used for assessment

The diagonal cells (e.g., Category A vs. Category A, Category B vs. Category B) typically have a similarity score of 1.00, indicating complete similarity (since a category is identical to itself)
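To make the idea concrete, here is a minimal Python sketch of one common way to score a pair of cards: the share of participants who placed both cards in the same group. The four cards (A–D), the five participants, and the function name `cooccurrence_similarity` are all hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical sorts: each participant's groups, written as sets of card names.
sorts = {
    "P1": [{"A", "B"}, {"C", "D"}],
    "P2": [{"A", "B", "C"}, {"D"}],
    "P3": [{"A"}, {"B"}, {"C", "D"}],
    "P4": [{"A", "B"}, {"C"}, {"D"}],
    "P5": [{"A", "B", "C", "D"}],
}

def cooccurrence_similarity(sorts):
    """Share of participants who placed each pair of cards in the same group."""
    cards = sorted(set().union(*(g for groups in sorts.values() for g in groups)))
    together = {pair: 0 for pair in combinations(cards, 2)}
    for groups in sorts.values():
        for group in groups:
            # Every pair inside one group counts once for this participant.
            for pair in combinations(sorted(group), 2):
                together[pair] += 1
    total = len(sorts)
    return {pair: count / total for pair, count in together.items()}

similarity = cooccurrence_similarity(sorts)
for pair, score in similarity.items():
    print(pair, score)
```

Only the pairs above the diagonal are stored, since the score for (A, B) is the same as for (B, A) and every card scores 1.00 against itself.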

Why use a similarity matrix?

This may seem complicated, but similarity matrixes can be quite useful:

A similarity matrix is far more efficient than manually analyzing the card sorts of each of your participants to find common groupings, which can be a tedious process. Using a similarity matrix will help you summarize all this information, allowing you to spot user consensus at a glance. This saves time and effort in uncovering patterns.

It’s a more robust methodology that goes beyond categorizations and actually shows similarities on a card-to-card basis. This will help you identify potential subcategories or refine your initial category labels because a similarity matrix reveals cards users intuitively associate with each other, even if they ended up in different categories.

It will help you spot cards, topics or areas that need further investigation. If you keep an eye out for sparsely filled areas in the matrix, you’ll be able to identify cards that weren’t consistently grouped with any others. This can point to poorly defined cards or concepts that users struggled to categorize. You can then investigate these cards further to understand the cause of user confusion and further improve the information architecture.

How to create a similarity matrix to analyze your card sorting data (template included)

Here’s a breakdown of creating a similarity matrix in Excel, including a template linked here:

Step 1: Conducting the card sorting exercise

Let’s say you have a set of 10 cards in an open card sorting exercise. You then invite participants (let’s call them Participant 1, Participant 2, Participant 3, etc.) to sort these cards into groups based on how they think these categories are related or should be organized. Each participant can create their own groups and label them as they see fit.

Step 2: Collect and set up your data

During the exercise, you observe and record how each participant groups the cards. You will note what groups or categories were created by the participants and which cards were placed into each of these groups/categories by each participant.

Open a new Excel sheet and create two headers:

In the first column (Column A, starting from Row 2), list the participant names (P1, P2, etc.)

In the first row (starting from Column B), list the groups or categories created in the card sorting exercise (G1, G2, etc.)

Here’s an example of how you might record this data:

Collect raw data, for example:

      G1                G2                G3                G4                  G5
P1    C1, C2, C3        C6, C9            C7, C8            C4, C5
P2    C1, C4            C2, C3            C5, C6            C7, C8, C9, C10
P3    C3, C4, C8        C1, C2            C9, C10           C5, C6, C7
P4    C5, C6, C7        C1, C3            C2, C4, C8, C9    C10
P5    C1, C7            C2, C4, C8        C3, C5            C6, C9, C10
P6    C2, C3, C4, C5    C1                C6, C7, C8, C9    C10
P7    C8, C9, C10       C1, C2, C3, C4    C5, C6, C7
P8    C4, C5, C6        C1, C2, C3        C7, C8, C9        C10
P9    C2, C5, C7, C8    C1, C3, C4        C6, C9, C10

In this table:

Participant name: Lists the names or identifiers of the participants. So P1 = Participant 1

Group columns (Group 1, Group 2, etc.): Each column represents a group created by the participants during the exercise. So G1 = Group 1

Input cards: The cells in these columns contain the cards from the sorting exercise that the participant placed in each group. So C1 = Card 1

Step 3: Clean up the data for easier analysis

The current data format shows each participant’s groups and the cards within those groups. To calculate the similarity matrix, we need to transform this data to show which cards each participant grouped together.

Here’s how to achieve this:

In a new sheet within the same spreadsheet, list the cards (C1–C10) in the first column and the participants (P1–P10) in the first row. In each cell, enter the group (G1–G5) in which that participant placed that card.

Here’s what your data will now look like:

       P1   P2   P3   P4   P5   P6   P7   P8   P9   P10
C1     G1   G5   G5   G5   G2   G5   G5   G1   G5   G2
C2     G4   G5   G5   G4   G2   G3   G4   G5   G5   G5
C3     G5   G5   G5   G5   G2   G2   G2   G2   G2   G2
C4     G1   G1   G2   G1   G2   G1   G2   G1   G2   G1
C5     G4   G4   G4   G1   G2   G4   G4   G4   G2   G2
C6     G3   G3   G3   G1   G3   G1   G3   G3   G3   G3
C7     G4   G4   G4   G1   G4   G4   G4   G4   G1   G4
C8     G4   G4   G4   G1   G4   G4   G4   G4   G1   G4
C9     G3   G3   G3   G3   G3   G3   G3   G3   G3   G3
C10    G1   G1   G3   G1   G1   G1   G1   G3   G3   G1
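If you’d rather script this reshaping step than do it by hand, the following Python sketch shows the general idea: it inverts a “participant → group → cards” record into a “card → participant → group” lookup. The two sample rows, the dictionary layout, and the function name `card_to_group` are illustrative assumptions, so the output isn’t meant to reproduce the table above.

```python
# Illustrative input in the shape of the Step 2 sheet (two participants only).
raw = {
    "P1": {"G1": ["C1", "C2", "C3"], "G2": ["C6", "C9"], "G3": ["C7", "C8"], "G4": ["C4", "C5"]},
    "P2": {"G1": ["C1", "C4"], "G2": ["C2", "C3"], "G3": ["C5", "C6"], "G4": ["C7", "C8", "C9", "C10"]},
}

def card_to_group(raw):
    """Invert {participant: {group: [cards]}} into {card: {participant: group}}."""
    table = {}
    for participant, groups in raw.items():
        for group, cards in groups.items():
            for card in cards:
                table.setdefault(card, {})[participant] = group
    return table

table = card_to_group(raw)
for card in sorted(table, key=lambda c: int(c[1:])):
    # A card that a participant left unsorted is shown as "-".
    print(card, [table[card].get(p, "-") for p in raw])
```

Each printed row corresponds to one row of the reshaped sheet: the card, followed by the group each participant chose for it.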

Step 4: Apply a frequency analysis

First, carry out a frequency analysis by counting how many times each card is placed in a specific group by the participants. This will give you insights into commonalities or differences in how participants perceive the categories.

We will use the COUNTIF function for this:

In a new sheet, make C1–C10 the first column and G1–G5 the first row.

In cell B3 (directly below the G1 header, which the formula expects in B2), type this formula: =COUNTIF(Sheet2!$B3:$K3, $B$2)

Copy the formula down the column.

In the next column, edit the formula to reference that column’s header (C2), like this: =COUNTIF(Sheet2!$B3:$K3, $C$2)

Repeat this until you reach the last column of groups.

Here’s what your data will look like now:

       G1   G2   G3   G4   G5
C1     2    2    0    0    6
C2     0    1    1    3    5
C3     0    6    0    0    4
C4     6    4    0    0    0
C5     1    3    0    6    0
C6     2    0    8    0    0
C7     2    0    0    8    0
C8     2    0    0    8    0
C9     0    0    10   0    0
C10    7    0    3    0    0
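The same count can be sketched in Python as a cross-check on the COUNTIF step. This is only an illustration: the `placements` dictionary is the Step 3 table typed in by hand, and the function name `frequency_table` is made up for this example.

```python
from collections import Counter

GROUPS = ["G1", "G2", "G3", "G4", "G5"]

# Step 3 table: for each card, the group chosen by P1..P10, in that order.
placements = {
    "C1": "G1 G5 G5 G5 G2 G5 G5 G1 G5 G2".split(),
    "C2": "G4 G5 G5 G4 G2 G3 G4 G5 G5 G5".split(),
    "C3": "G5 G5 G5 G5 G2 G2 G2 G2 G2 G2".split(),
    "C4": "G1 G1 G2 G1 G2 G1 G2 G1 G2 G1".split(),
    "C5": "G4 G4 G4 G1 G2 G4 G4 G4 G2 G2".split(),
    "C6": "G3 G3 G3 G1 G3 G1 G3 G3 G3 G3".split(),
    "C7": "G4 G4 G4 G1 G4 G4 G4 G4 G1 G4".split(),
    "C8": "G4 G4 G4 G1 G4 G4 G4 G4 G1 G4".split(),
    "C9": "G3 G3 G3 G3 G3 G3 G3 G3 G3 G3".split(),
    "C10": "G1 G1 G3 G1 G1 G1 G1 G3 G3 G1".split(),
}

def frequency_table(placements, groups=GROUPS):
    """Count, per card, how many participants placed it in each group."""
    table = {}
    for card, chosen in placements.items():
        counts = Counter(chosen)  # missing groups count as 0
        table[card] = [counts[g] for g in groups]
    return table

freq = frequency_table(placements)
for card, row in freq.items():
    print(f"{card:<4}", row)
```

Each printed row lists the counts for G1 to G5, so it can be compared line by line with the frequency table above.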

Step 5: Create a similarity matrix

Create a similarity matrix to see how similarly each pair of cards was grouped. We will combine the COUNT, IF, and SUM functions in an array formula that compares two cards’ rows in the frequency table (Sheet3) and calculates the share of groups in which their counts match.

In a new sheet (Sheet4), use C1–C10 as both the first column and the header row.

In the first cell of the matrix, enter the formula: =ArrayFormula(SUM(IF(Sheet3!$B3:$F3=Sheet3!$B$3:$F$3,1,0))/COUNT(Sheet3!$B3:$F3))

Copy this formula down the first column.

For the next column, edit the formula to capture the next row (B4:F4) in Sheet3, like this: =ArrayFormula(SUM(IF(Sheet3!$B3:$F3=Sheet3!$B$4:$F$4,1,0))/COUNT(Sheet3!$B3:$F3))

Repeat this until you reach the last column.

ArrayFormula is a Google Sheets function; in Excel, omit the wrapper and enter the rest as an array formula.

Here’s what your similarity matrix will look like:

       C1    C2    C3    C4    C5    C6    C7    C8    C9    C10
C1     1     0     0.4   0.4   0.2   0.4   0.4   0.4   0.2   0.2
C2     0     1     0.2   0     0     0     0     0     0.2   0
C3     0.4   0.2   1     0.4   0.2   0.2   0.2   0.2   0.4   0.2
C4     0.4   0     0.4   1     0.4   0.4   0.4   0.4   0.4   0.4
C5     0.2   0     0.2   0.4   1     0.2   0.4   0.4   0.2   0.2
C6     0.4   0     0.2   0.4   0.2   1     0.6   0.6   0.6   0.6
C7     0.4   0     0.2   0.4   0.4   0.6   1     1     0.4   0.4
C8     0.4   0     0.2   0.4   0.4   0.6   1     1     0.4   0.4
C9     0.2   0.2   0.4   0.4   0.2   0.6   0.4   0.4   1     0.6
C10    0.2   0     0.2   0.4   0.2   0.6   0.4   0.4   0.6   1

In this similarity matrix, each cell represents the similarity or association between two cards based on how frequently they were grouped together in the card sorting exercise.
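For readers who want to check the spreadsheet result, here is a Python sketch that mirrors what the Step 5 formula computes: for two cards, it counts the groups in which their Step 4 frequencies are equal and divides by the number of groups. The function name `profile_similarity` is hypothetical, and the `freq` dictionary is simply the Step 4 table typed in by hand.

```python
# Step 4 frequency table: counts for G1..G5 per card.
freq = {
    "C1": [2, 2, 0, 0, 6],
    "C2": [0, 1, 1, 3, 5],
    "C3": [0, 6, 0, 0, 4],
    "C4": [6, 4, 0, 0, 0],
    "C5": [1, 3, 0, 6, 0],
    "C6": [2, 0, 8, 0, 0],
    "C7": [2, 0, 0, 8, 0],
    "C8": [2, 0, 0, 8, 0],
    "C9": [0, 0, 10, 0, 0],
    "C10": [7, 0, 3, 0, 0],
}

def profile_similarity(row_a, row_b):
    """Share of groups in which two cards have the same count."""
    matches = sum(1 for a, b in zip(row_a, row_b) if a == b)
    return matches / len(row_a)

# Full card-by-card matrix as a nested dictionary.
matrix = {a: {b: profile_similarity(freq[a], freq[b]) for b in freq} for a in freq}

for card, row in matrix.items():
    print(f"{card:<4}", [row[other] for other in freq])
```

One thing worth knowing about this measure is that a group neither card was placed in (0 and 0) counts as a match, so two cards can score above 0 without ever sharing a group.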

Here’s how you can interpret the data:

The diagonal cells (e.g., C1–C1, C2–C2, etc.) all have a value of 1. This indicates that each card is 100 percent similar to itself, which is expected since a card is always identical to itself

Values closer to 1 (e.g., 0.6) suggest a higher similarity between the respective pairs of cards. For instance, C6 and C9 have a similarity value of 0.6, indicating that they were often grouped together during the exercise

Values closer to 0 (e.g., 0.2) indicate a lower similarity between the cards. For example, C2 and C4 have a similarity of 0. This implies that they were rarely or never grouped together in the participants’ sorting

You can use this matrix to infer which cards tended to be grouped together frequently (higher values) and which cards were less likely to be grouped together (lower values). For instance, cells C7–C8 and C8–C7 both have a value of 1 in the similarity matrix, which means that cards C7 and C8 were consistently grouped together in the card sorting exercise by the participants.

The value of 1 indicates a perfect similarity, implying that whenever C7 appeared in a group, C8 was also present, and vice versa, across all instances of sorting. This high similarity suggests a strong association or relationship between cards C7 and C8 in the participants’ grouping preferences. This would be a key insight from the card sorting exercise.

Pro tips to level up your similarity matrix analysis

I’ve covered the basics, but these tips can help you analyze your similarity matrix and get better insights. Sometimes, making sense of your data is the hardest part!

Normalize your data

Consider normalizing your data, especially if the number of times participants sorted cards varied significantly. This can involve dividing each cell value by the total number of times a card appeared in any category. Normalization ensures all cards are compared on a fair basis, regardless of how often they were individually sorted.
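As a rough illustration of this kind of normalization, the Python sketch below divides each raw co-occurrence count by the number of times the row’s card was sorted. The counts, the totals, and the function name `normalize` are hypothetical; they are not taken from the exercise above.

```python
def normalize(counts, appearances):
    """Divide each raw count by how often the row's card was sorted."""
    return {
        a: {b: (n / appearances[a] if appearances[a] else 0.0) for b, n in row.items()}
        for a, row in counts.items()
    }

# Hypothetical raw counts: how many times each pair of cards shared a group.
counts = {
    "C1": {"C1": 10, "C2": 6},
    "C2": {"C1": 6, "C2": 8},
}
# Hypothetical totals: C2 was left unsorted by two of ten participants.
appearances = {"C1": 10, "C2": 8}

normalized = normalize(counts, appearances)
print(normalized)
```

Because each row is divided by its own card’s total, the result is no longer symmetric: the C1–C2 cell and the C2–C1 cell can differ. If you need a symmetric matrix, you could divide by a shared quantity instead, such as the number of participants who sorted both cards.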

Use conditional formatting

Use conditional formatting in your spreadsheet to emphasize high co-occurrence values and highlight strong relationships. For example, cells with a darker shade could indicate cards frequently grouped together, while lighter shades represent weaker relationships.

This allows you to visually identify strong associations at a glance. Densely colored areas will signify frequently grouped cards, while sparse areas pinpoint concepts that need further attention.
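Outside a spreadsheet, the same idea can be approximated with a text “heat map”. The Python sketch below maps each similarity value to a symbol, with denser symbols standing in for darker shades. The cutoffs, the symbols, and the function names are arbitrary choices for illustration; the values are the C6–C9 block of the Step 5 matrix.

```python
# Hypothetical cutoffs and symbols; denser symbols mean stronger relationships.
SHADES = [(0.8, "#"), (0.6, "+"), (0.4, "-"), (0.2, "."), (0.0, " ")]

def shade(value):
    """Return the symbol for the highest cutoff that the value reaches."""
    for cutoff, symbol in SHADES:
        if value >= cutoff:
            return symbol
    return " "

def render(labels, rows):
    """Build one shaded text line per card."""
    return [f"{label:<4}" + "".join(shade(v) for v in row) for label, row in zip(labels, rows)]

labels = ["C6", "C7", "C8", "C9"]
rows = [
    [1, 0.6, 0.6, 0.6],
    [0.6, 1, 1, 0.4],
    [0.6, 1, 1, 0.4],
    [0.6, 0.4, 0.4, 1],
]
for line in render(labels, rows):
    print(line)
```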

Pair with a tree diagram

Use a dendrogram (tree diagram) alongside your similarity matrix. This can reveal potential hierarchical structures within your data. You could imagine the matrix as a landscape, and the dendrogram as a set of branching trees growing from the base.

Cards with strong relationships cluster together on the branches, forming subcategories or child categories within a broader information architecture. This kind of visualisation can be particularly helpful for complex card sorting exercises where there are many items.
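A dendrogram is usually produced by hierarchical clustering, and dedicated tools or statistics libraries will normally do this for you. To show what happens under the hood, here is a small pure-Python sketch of average-linkage clustering: it repeatedly merges the two most similar clusters and records the order of the merges, which is the information a dendrogram draws. The function name `cluster` is hypothetical, and the values are a five-card subset of the Step 5 matrix.

```python
def cluster(similarity, cards):
    """Average-linkage agglomerative clustering; returns the merges in order."""
    clusters = [(card,) for card in cards]
    merges = []

    def linkage(a, b):
        # Mean similarity over every cross-cluster pair of cards.
        return sum(similarity[x][y] for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Pick the most similar pair of clusters (the first one wins ties).
        score, i, j = max(
            ((linkage(a, b), i, j)
             for i, a in enumerate(clusters)
             for j, b in enumerate(clusters) if i < j),
            key=lambda t: t[0],
        )
        merges.append((clusters[i], clusters[j], score))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Pairwise values taken from the Step 5 matrix for five of the cards.
pairs = {
    ("C2", "C6"): 0.0, ("C2", "C7"): 0.0, ("C2", "C8"): 0.0, ("C2", "C9"): 0.2,
    ("C6", "C7"): 0.6, ("C6", "C8"): 0.6, ("C6", "C9"): 0.6,
    ("C7", "C8"): 1.0, ("C7", "C9"): 0.4, ("C8", "C9"): 0.4,
}
cards = ["C2", "C6", "C7", "C8", "C9"]
similarity = {c: {} for c in cards}
for (a, b), value in pairs.items():
    similarity[a][b] = similarity[b][a] = value

merges = cluster(similarity, cards)
for left, right, score in merges:
    print(left, "+", right, "at", round(score, 2))
```

Merges that happen at high scores sit near the leaves of the tree, while merges at low scores sit near the root. Average linkage is only one option; single or complete linkage would use the maximum or minimum cross-cluster similarity instead.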

Use card sorting software

While the manual method and spreadsheets work well for basic analysis of small datasets, dedicated card sorting software like Maze and UXtweak can automate many of these steps and simplify the analysis of larger datasets. They also offer additional features like:

Prebuilt templates for data collection

Automatic similarity matrix generation with color coding

Heatmap overlays for visualizing co-occurrence patterns

Advanced clustering algorithms for identifying potential information architectures

Final words

We’ve shown how a similarity matrix helps you understand the relationships between different cards based on the participants’ sorting behavior. It provides insights into which cards are conceptually related or similar in the participants’ perception.

Hence, if you’re looking for a tool that will provide a clear and efficient way to analyze card sorting data, helping you identify user-driven information structures for your website, app or product, then look no further than a similarity matrix. The template provided will help you carry out your open card sorting exercises with ease.

Header image source: IconScout