Using hair care products as an example, I will go through an exercise that illustrates some of the tricky issues encountered when categorizing data. Why do hair care products need categorization? Suppose we are creating a hair care shopping web site where consumers are guided to the right products for their particular needs. Surprisingly, even simple shopping web sites need information architecture work at some point in their lives. Even more surprising is how quickly things get messy if you try to design the categorization up front. This leads to the question of whether it is even possible to get it right. What is so complex about categorization? Let’s explore…
Hundreds of hair products are out there. Ever since the ’90s I have found the array of hair care options confusing. I go into the store looking for a shampoo and end up in an aisle stacked with hundreds of different bottles organized in a way that makes absolutely no sense (at least not to me). Can there really be hundreds of different products? Is this really all shampoo? You know, the stuff you wash your hair with? It seems shampooing has been transformed, thanks to the marketing genius of the last few decades, from basic hygiene into a brave new world of concepts I had never heard of before. What really matters now is how a shampoo makes you feel. Take, for example, this one, direct from the label: an aromatherapeutic, organic, paraben-free, moisture-balancing, 100% biodegradable, no artificial colors, no animal ingredients, pH-balanced shampoo for all hair types. I nostalgically remember from my childhood a simple world where there was no confusion. Where’s the shampoo? Oh, it’s right here; it comes in one size and works for everyone. See, the label says “shampoo” and it is right next to the package labeled “toothpaste”. Well, that’s the way I remember it anyway.
The shampoo product selection can be confusing, but we are not without hope. In my work as an IT Architect, I’ve seen harder categorization problems solved. No, we don’t need no shampoo expert; all we need is a little bit of taxonomy, you know, like Carl Linnaeus did for plants and animals. We’ll lay out the kingdoms, classes, orders, genera, and species for shampoo, then we’ll plug it all into a product configurator, navigate through the choices, and presto-magico, a shampoo will be selected for us. And just for fun we’ll throw in the rest of the hair care family of products. It’s that easy. Let’s jump in…
Here is a first attempt at the high-level categories:
- Hair Care Product
  - Shampoo
  - Conditioner
  - Hair Gel
  - Hair Mousse
  - Hair Dye
We could say, though, that gel and mousse are kind of the same thing. So I’ll check with Wikipedia (usually I would look up ISO standards, but I don’t think they cover hair). Wikipedia has an article on gel which gives me a few more categories:
- Hair Care Product
  - Hair Spray
  - Hair Glue
  - Hair Wax
  - Ethnic Gel
  - Hair Coloring Gel
I’m not quite sure what the differences are, but level of hold seems to be a factor. Also, Wikipedia suggested a type of gel that could also go in the hair dye category. And I can think of another product that straddles two categories:
- All in One – Shampoo & Conditioner
Issue: not all things fall neatly into one category. This calls into question the use of single-parent hierarchies for all cases.
No problem, we’ll just whip up a poly-hierarchy and put these categories under two parents.
- Hair Care Product
- Shampoo
- All in One – Shampoo & Conditioner
- Conditioner
- All in One – Shampoo & Conditioner
- Hair Styling Product
- Hair Gel
- Ethnic Gel
- Other Gel
- Hair Spray
- Hair Wax
- Hair Coloring Gel
- Hair Gel
- Hair Dye
- Hair Coloring Gel
- Other Hair Dye
- Shampoo
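A poly-hierarchy like this is straightforward to sketch in code. Here is a minimal Python illustration, assuming we simply record a set of parents for each category. The `PolyHierarchy` class and its method names are my own invention for this example and are not taken from any particular product configurator.

```python
from collections import defaultdict

ALL_IN_ONE = "All in One – Shampoo & Conditioner"


class PolyHierarchy:
    """A category graph in which a node may have more than one parent."""

    def __init__(self):
        self.parents = defaultdict(set)   # child -> set of parents
        self.children = defaultdict(set)  # parent -> set of children

    def add(self, parent, child):
        """Record that `child` sits under `parent` (possibly among others)."""
        self.parents[child].add(parent)
        self.children[parent].add(child)

    def ancestors(self, node):
        """Return every category reachable by walking upward from `node`."""
        seen = set()
        stack = [node]
        while stack:
            for parent in self.parents.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


h = PolyHierarchy()
h.add("Hair Care Product", "Shampoo")
h.add("Hair Care Product", "Conditioner")
h.add("Shampoo", ALL_IN_ONE)       # first parent
h.add("Conditioner", ALL_IN_ONE)   # second parent

print(sorted(h.parents[ALL_IN_ONE]))
print(sorted(h.ancestors(ALL_IN_ONE)))
```

The only real change from a plain tree is that `parents` holds a set rather than a single value, so the all-in-one category can be reached by drilling down through either “Shampoo” or “Conditioner”.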
Notice that there are four categories that contain only one sub-category: “Shampoo”, “Conditioner”, “Hair Gel”, and “Hair Dye”. In all four cases the sub-category by no means covers all of the products in the category. When selecting a category, users will typically try to drill down the hierarchy and pick a leaf sub-category. If, for example, I am trying to categorize “Redken Clear Moisture Shampoo”, I would drill into “Shampoo” and look for an appropriate sub-category, but none exists. For this reason we may want to create “other” categories to help users pick. (I have seen this situation arise time and time again.)
Issue: the “other” category. Sometimes items fall into a parent category, but none of the parent’s sub-categories make sense for them. Here you are tempted to create an “other” category to make the hierarchy more user-friendly. Sometimes a need for a “None” or “Not Selected” category also arises.
Here is the hierarchy with the “other” categories added.
- Hair Care Product
- Shampoo
- All in One – Shampoo & Conditioner
- Other Shampoos
- Conditioner
- All in One – Shampoo & Conditioner
- Other Conditioners
- Hair Styling Product
- Hair Gel
- Ethnic Gel
- Other Gel
- Hair Spray
- Hair Wax
- Hair Coloring Gel
- Hair Gel
- Hair Dye
- Hair Coloring Gel
- Other Hair Dyes
- Shampoo
Now it seems we have a good start on the root of the hierarchy, but we still don’t have enough sub-categories to easily find one product among hundreds. Let’s take another stab at the lower levels of the hierarchy. Looking around the web, I easily find an online drugstore with useful-seeming hierarchies of shampoo and conditioner. Now that we have more sub-categories under “Shampoo” and “Conditioner”, I can get rid of my “Other” categories:
- Hair Care Product
- Shampoo
- All in One – Shampoo & Conditioner
- Moisturizing Shampoo
- Dandruff Shampoo
- Natural Shampoo
- Everyday Usage Shampoo
- Children’s Shampoo
- Conditioner
- All in One – Shampoo & Conditioner
- Dandruff Conditioner
- Natural Conditioner
- Color Treated Hair Conditioner
- Detanglers
- Leave in Conditioner
- Children’s Conditioner
- Shampoo
Shampoos and conditioners have several overlapping concepts (such as “Children’s Conditioner” and “Children’s Shampoo” in the list above), but these cannot be merged into a single category node in the poly-hierarchy since they are different categories. Instead, the same concept is applied to multiple category nodes. Such a situation usually indicates that we need to pull these concepts out into another dimension of categorization. Maybe we create a “used for” categorization taxonomy:
- Used For
  - Cleaning
  - Moisturizing
  - Dandruff
  - Everyday Usage
  - Children
  - Color Treated Hair
  - Detangling
Rule: Separate out overlapping concepts into their own taxonomy. As much as possible, a single taxonomy should be a consistent, single viewpoint on the problem.
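Overlapping concepts of this kind can sometimes be spotted mechanically. The Python sketch below is only a rough heuristic of my own, and it assumes that sub-categories are named “<qualifier> <parent category>” (as in “Dandruff Shampoo”). The function name `shared_qualifiers` and the sample data layout are hypothetical; names that do not follow the pattern, such as “Detanglers”, are simply ignored.

```python
from collections import defaultdict


def shared_qualifiers(subcategories):
    """Return qualifiers that appear under more than one parent category.

    `subcategories` maps a parent category name to its sub-category names.
    A qualifier is whatever precedes the parent name, for example
    "Dandruff" in "Dandruff Shampoo".
    """
    seen = defaultdict(set)  # qualifier -> parents it was found under
    for parent, names in subcategories.items():
        suffix = " " + parent
        for name in names:
            if name.endswith(suffix):
                seen[name[: -len(suffix)]].add(parent)
    return {q: parents for q, parents in seen.items() if len(parents) > 1}


subcategories = {
    "Shampoo": [
        "Moisturizing Shampoo", "Dandruff Shampoo", "Natural Shampoo",
        "Everyday Usage Shampoo", "Children's Shampoo",
    ],
    "Conditioner": [
        "Dandruff Conditioner", "Natural Conditioner",
        "Color Treated Hair Conditioner", "Detanglers",
        "Leave in Conditioner", "Children's Conditioner",
    ],
}

shared = shared_qualifiers(subcategories)
print(sorted(shared))  # ["Children's", 'Dandruff', 'Natural']
```

Each qualifier reported here is a candidate for its own taxonomy. Deciding whether “Dandruff” and “Children's” belong under “Used For” while “Natural” belongs somewhere else is still a judgment call that the code cannot make.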
Notice now that this new taxonomy is a little different in the way it gets applied. A single product can be assigned several of these categories, whereas in our “Hair Care Product” taxonomy we attempted to create categories that would uniquely describe each product. The multiple-parent issue we had with “All in One” shampoos and conditioners could be solved this way also. It might even work better that way, since “all-in-one” could include other functions such as dyeing hair. This thinking radically changes the hierarchy. Here we have three taxonomies (I have added another to handle the “ingredients” dimension):
- Hair Care Product
  - Shampoo
  - Conditioner
  - Hair Styling Products
    - Hair Gel
    - Hair Spray
    - Hair Wax
  - Hair Dye
- Used For
  - Cleaning
  - Moisturizing
  - Dandruff
  - Everyday Usage
  - Children
  - Color Treated Hair
  - Detangling
  - Ethnic Hair
- Made With
  - Natural Ingredients
  - Organic Ingredients
  - Edible Ingredients
This also changes the way we categorize each product. Rather than picking the category as a single point in a single taxonomy, we pick multiple categories from multiple taxonomies. Here is an example of the categorization for a fictional product:
- Studio X All-in-One Natural Shampoo and Conditioner
  - Shampoo
  - Conditioner
  - Cleaning
  - Moisturizing
  - Natural Ingredients
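To show how such a multi-taxonomy categorization might be stored, here is a minimal Python sketch. It assumes each taxonomy is flattened to a plain set of category names (dropping the nesting for brevity); the `TAXONOMIES` table and the `classify` helper are illustrative names of my own, not an established API.

```python
# Each taxonomy flattened to the set of category names it contains.
TAXONOMIES = {
    "Hair Care Product": {
        "Shampoo", "Conditioner", "Hair Styling Products",
        "Hair Gel", "Hair Spray", "Hair Wax", "Hair Dye",
    },
    "Used For": {
        "Cleaning", "Moisturizing", "Dandruff", "Everyday Usage",
        "Children", "Color Treated Hair", "Detangling", "Ethnic Hair",
    },
    "Made With": {
        "Natural Ingredients", "Organic Ingredients", "Edible Ingredients",
    },
}


def classify(tags):
    """Group a flat list of category names by the taxonomy each belongs to.

    Raises ValueError for a name that appears in no taxonomy, so a typo
    cannot silently create a new category.
    """
    result = {name: set() for name in TAXONOMIES}
    for tag in tags:
        owners = [name for name, cats in TAXONOMIES.items() if tag in cats]
        if not owners:
            raise ValueError(f"unknown category: {tag}")
        for name in owners:
            result[name].add(tag)
    return result


studio_x = classify([
    "Shampoo", "Conditioner", "Cleaning", "Moisturizing",
    "Natural Ingredients",
])
for taxonomy, categories in studio_x.items():
    print(taxonomy, sorted(categories))
```

The product ends up with a set of categories per taxonomy instead of one position in one tree, which is exactly the shift described above.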
What began as a rigid hierarchical classification system has evolved into a more flexible system which is almost as intuitive, and in some cases maybe more intuitive. This is good progress for our design. It’s not yet perfect, but it has come a long way. In the end it is worthwhile to challenge the single-hierarchy idea. But this new approach does take a little practice to get right. We have to get good at analyzing and picking abstract concepts such as “Used For” and “Made With”. Imagine these abstract concepts as dimensions in a multidimensional space, and try to make them all orthogonal so that the concepts work independently, each taking you closer to the correct point in your categorization space.
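As a rough illustration of navigating by independent dimensions, the Python sketch below narrows a tiny catalog one dimension at a time. Everything in it is an assumption made for the example: the `CATALOG` contents (the two “Hypothetical” products are made up), and the `matches` and `select` helpers.

```python
CATALOG = {
    "Studio X All-in-One Natural Shampoo and Conditioner": {
        "Hair Care Product": {"Shampoo", "Conditioner"},
        "Used For": {"Cleaning", "Moisturizing"},
        "Made With": {"Natural Ingredients"},
    },
    "Hypothetical Kids Detangler": {
        "Hair Care Product": {"Conditioner"},
        "Used For": {"Children", "Detangling"},
    },
    "Hypothetical Dandruff Shampoo": {
        "Hair Care Product": {"Shampoo"},
        "Used For": {"Cleaning", "Dandruff"},
    },
}


def matches(facets, criteria):
    """True when the product has every wanted category in every dimension."""
    return all(
        wanted <= facets.get(dimension, set())
        for dimension, wanted in criteria.items()
    )


def select(catalog, criteria):
    """Return the names of matching products, sorted alphabetically."""
    return sorted(
        name for name, facets in catalog.items() if matches(facets, criteria)
    )


# Each dimension narrows the result without affecting the others.
print(select(CATALOG, {"Hair Care Product": {"Shampoo"}}))
print(select(CATALOG, {"Hair Care Product": {"Shampoo"},
                       "Made With": {"Natural Ingredients"}}))
```

Because the dimensions are independent, a criterion on “Made With” can be added or removed without touching the “Hair Care Product” choice, which is the practical payoff of keeping them orthogonal.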