Creating experiment taxonomy: Organizing for insights

Mon Jun 23 2025

You know that sinking feeling when you're trying to find results from an experiment you ran six months ago? You're digging through spreadsheets, Slack messages, and random Google Docs, wondering why your team doesn't have a better system. That's exactly where experiment taxonomy comes in handy.

Think of it as creating a filing system for your experiments - but one that actually makes sense and helps you spot patterns you'd otherwise miss. Once you set it up right, finding insights becomes as easy as browsing Netflix categories instead of endlessly scrolling through an unorganized mess.

Understanding experiment taxonomy

Let's start with the basics. Experiment taxonomy is just a fancy way of saying "organizing your experiments into logical groups." It's like creating folders on your computer, but for test results and learnings.

The real magic happens when you can quickly answer questions like: "What pricing tests did we run last quarter?" or "Which onboarding experiments actually moved the needle?" Without a taxonomy, you're basically throwing darts in the dark and hoping to hit something useful.

Good taxonomies share a few traits. They use plain English (no jargon-heavy categories that newcomers won't understand). They're flexible enough to grow with your testing program. And most importantly, they reflect how your team actually thinks about experiments. The Reddit UX Research community has some great discussions about getting stakeholders involved early - because nothing kills a taxonomy faster than building it in isolation.

Here's what a solid taxonomy does for your team:

  • Creates a shared language (no more "that test we did with the thing")

  • Reveals testing gaps ("Wait, we've never tested checkout on mobile?")

  • Makes reporting actually useful

  • Helps new team members get up to speed quickly

The best part? When you connect your taxonomy to your data tools - whether that's a simple spreadsheet or a full experimentation platform - you can start spotting trends automatically. Some teams even use AI tools to surface insights they might have missed.

Building an effective experiment taxonomy

Alright, so you're sold on the idea. Now what? Building a taxonomy from scratch feels daunting, but it's actually pretty straightforward if you break it down.

Start by gathering your experimentation team - and I mean everyone. Product managers, data scientists, engineers, designers. Get them in a room (or Zoom) and ask: "How do we currently think about our experiments?" You'll be surprised how different perspectives emerge. As folks in the prompt engineering community discovered, collaborative taxonomy building beats solo efforts every time.

Here's a practical structure that works for most teams:

Experiment type: Is it an A/B test, a gradual rollout, or a holdout? Each has different analysis needs.

Metrics that matter: Pick your north stars - conversion rate, revenue per user, engagement time. Just don't go overboard; tracking 50 metrics means tracking nothing well.

Who you're testing: New users behave differently than power users. Mobile users aren't desktop users. Be specific about your segments.

Product area: Homepage tests need different treatment than checkout experiments. Group by where the change lives.
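To make that concrete, here's a minimal sketch of those four dimensions as a structured record. Everything in it - the field names, the allowed values, the ExperimentRecord class itself - is an illustrative assumption, not a prescribed schema:

```python
from dataclasses import dataclass

# Illustrative allowed values - swap in whatever fits your product.
EXPERIMENT_TYPES = {"ab_test", "gradual_rollout", "holdout"}
PRODUCT_AREAS = {"homepage", "search", "checkout", "onboarding"}
SEGMENTS = {"new_users", "power_users", "mobile", "desktop"}

@dataclass
class ExperimentRecord:
    name: str
    experiment_type: str        # one of EXPERIMENT_TYPES
    primary_metrics: list[str]  # a few north stars, not 50
    segments: list[str]         # who you're testing, from SEGMENTS
    product_area: str           # where the change lives, from PRODUCT_AREAS

# A tagged experiment that's easy to find and group later:
exp = ExperimentRecord(
    name="urgency_banner_v2",
    experiment_type="ab_test",
    primary_metrics=["conversion_rate"],
    segments=["mobile", "new_users"],
    product_area="checkout",
)
```

Even a flat record like this is enough to answer "what pricing tests did we run last quarter?" with a filter instead of a scavenger hunt.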

The key is making your taxonomy flexible. Your testing program will evolve - what works for 10 experiments a month won't scale to 100. Plan for growth from day one. Regular reviews keep things fresh and relevant. Statsig's tracking best practices guide has solid advice on building taxonomies that scale.

One warning: don't overthink it. I've seen teams spend months debating the perfect taxonomy structure while their experiment data piles up unorganized. Start simple, iterate often. You can always refine as you learn what works.

Implementing taxonomy in experimentation workflows

So you've built this beautiful taxonomy. Now comes the hard part: getting people to actually use it.

The secret is integration. If your taxonomy lives in a dusty wiki page, it's dead on arrival. You need to bake it into your daily workflows. When someone sets up a new experiment, the taxonomy fields should be right there - required fields if possible. Clear documentation helps, but automation helps more.
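Here's one way "required fields" can look in code, building on the hypothetical ExperimentRecord sketch above: a pre-launch check that refuses to create an experiment until every taxonomy field is valid. The rules are assumptions for illustration, not any platform's actual API:

```python
def validate_taxonomy(record: ExperimentRecord) -> list[str]:
    """Return a list of problems; an empty list means the experiment can launch."""
    problems = []
    if record.experiment_type not in EXPERIMENT_TYPES:
        problems.append(f"unknown experiment type: {record.experiment_type!r}")
    if not record.primary_metrics:
        problems.append("at least one primary metric is required")
    if record.product_area not in PRODUCT_AREAS:
        problems.append(f"unknown product area: {record.product_area!r}")
    bad_segments = set(record.segments) - SEGMENTS
    if bad_segments:
        problems.append(f"unknown segments: {sorted(bad_segments)}")
    return problems

# Wire this into experiment creation so untagged tests never ship:
issues = validate_taxonomy(exp)
if issues:
    raise ValueError("fix taxonomy before launch: " + "; ".join(issues))
```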

Training matters too. Don't assume everyone understands why "Feature: Checkout - Mobile" is different from "Feature: Mobile - Checkout." Spend time explaining the logic. Show real examples of how proper tagging led to actionable insights. People adopt systems they understand the value of.

Here's what maintenance looks like in practice:

  • Monthly reviews of new experiments to ensure proper tagging

  • Quarterly taxonomy updates based on new patterns

  • Annual overhauls when your product strategy shifts

The right tools make everything easier. Whether you're using Statsig's experimentation platform or building something custom, look for:

  • Auto-tagging capabilities

  • Tag validation (catch typos before they spread - see the sketch after this list)

  • Easy bulk updates

  • Clear reporting by taxonomy categories
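On the tag-validation point, here's a standalone sketch of what catching typos might look like, using only Python's standard library (the tag list is made up):

```python
import difflib

KNOWN_TAGS = {"checkout", "onboarding", "homepage", "search", "pricing"}

def check_tag(tag: str) -> str | None:
    """Return None if the tag is known, otherwise a warning with a suggestion."""
    if tag in KNOWN_TAGS:
        return None
    suggestion = difflib.get_close_matches(tag, KNOWN_TAGS, n=1)
    hint = f" - did you mean {suggestion[0]!r}?" if suggestion else ""
    return f"unknown tag {tag!r}{hint}"

print(check_tag("checkuot"))  # unknown tag 'checkuot' - did you mean 'checkout'?
```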

Remember: your taxonomy is a living thing. It should grow and change as your experimentation practice matures. The taxonomy that got you from 0 to 100 experiments won't get you from 100 to 1000.

Leveraging taxonomy for enhanced insights

This is where the payoff happens. A well-implemented taxonomy transforms your experiment data from a junk drawer into a gold mine.

Take this real example: an e-commerce team I worked with couldn't figure out why their conversion experiments had such mixed results. Once they implemented a product category taxonomy, the pattern jumped out - luxury items responded completely differently to urgency messaging than everyday products. Without that categorization, they were averaging apples and oranges.

Good taxonomy enables meta-analysis. Instead of looking at experiments in isolation, you can ask bigger questions (a simple rollup sketch follows the list):

  • Which types of experiments consistently deliver results?

  • What segments respond best to personalization?

  • Where should we focus our testing efforts next quarter?
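A rollup like this becomes a few lines of code once tags are consistent. The data below is fabricated for illustration, and "moved the metric" is a stand-in for whatever success criterion your team uses:

```python
from collections import defaultdict

# Hypothetical tagged outcomes: (experiment_type, product_area, moved_the_metric)
results = [
    ("ab_test", "checkout", True),
    ("ab_test", "checkout", False),
    ("gradual_rollout", "onboarding", True),
    ("ab_test", "homepage", True),
]

tally = defaultdict(lambda: [0, 0])  # (type, area) -> [wins, total]
for exp_type, area, won in results:
    tally[(exp_type, area)][0] += int(won)
    tally[(exp_type, area)][1] += 1

for (exp_type, area), (wins, total) in sorted(tally.items()):
    print(f"{exp_type} / {area}: {wins}/{total} experiments moved the primary metric")
```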

Marketing teams at Statsig use taxonomy to track campaign performance across channels, making it easy to shift budget to what actually works. Support teams categorize issues to spot trends before they become fires. The applications are endless once you have the structure.

But here's the real kicker: taxonomy accelerates learning across teams. When the growth team can easily find and learn from the product team's experiments, innovation spreads. Smart extrapolation of results becomes possible because you're comparing truly comparable things.

The best taxonomies become part of your team's vocabulary. "Let's look at all our onboarding experiments from Q3" becomes a five-minute exercise instead of a two-day archaeological dig. That efficiency compounds over time.

Closing thoughts

Building an experiment taxonomy isn't the most exciting part of experimentation - but it might be the most valuable. It's the difference between running experiments and running an experimentation program.

Start small. Pick your five most important categories and implement them consistently. Get buy-in from your team by showing quick wins. Then expand as you learn what works for your organization. The perfect taxonomy doesn't exist, but a good-enough taxonomy that everyone uses beats perfection every time.

Want to dive deeper? Check out how teams structure their experimentation programs in practice, or explore how modern experimentation platforms handle taxonomy at scale. Your future self (the one trying to find that crucial test result) will thank you.

Hope you find this useful!


