It goes deeper than just the content you see and the ranking algorithms behind it: the buttons you have to click, the way a post with multiple photos shows up, or even the shade of blue might be different.
At Statsig, we’ve been working to help any company run these same types of experiments. Sometimes, it’s as simple as showing you one version and me another, then comparing our behavior after the fact. But other times, the experiment you want to run doesn’t fit into this per-person model. If you look closely at your Facebook app, even within a single app session, you might notice subtle differences between two groups or two pages.
You could chalk it up to different types of groups: maybe one is a “Buy Nothing” group and the other is a group of your college roommates. But if you look closer still, and you are in the right set of groups, you could likely find differences even within the same type of group. What’s going on there?
Facebook (and many other companies) roll out features or experiments at the user level, which may explain why your blue is different from your friends' blue. But they also roll out features at the level of the group, page, event, company, workspace, organization, and so on, meaning the same user can get a different experience in two different entities of the same type.
For example, one of the first features I worked on that was gated to specific groups was a special post creation experience. I worked on a way to help people craft posts for buying and selling vehicles within groups, and I started by only allowing my test group ID to access that experience. Before I got the chance to test this at the group level, my team shifted its focus to building out the Marketplace experience. But I recently revisited my test group to find the experience was ultimately shipped!
Splitting a population based on the userID may not be the right decision for every experiment, like an experiment on a Facebook Group. If you need to guarantee that certain groups of people get the same experience in an A/B test, you likely need to split on a different unit type than the user. The unit type should match the population that needs a consistent experience in order for your test to work.
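To make the idea of a unit type concrete, here is a minimal sketch of deterministic bucketing. It is an illustration only, not how Facebook or Statsig assign variants: the function name `assign_variant`, the experiment name, and the IDs are all hypothetical. The point is that whatever ID you hash is the unit that gets a consistent experience.

```python
import hashlib

def assign_variant(experiment: str, unit_id: str, variants=("control", "test")) -> str:
    """Hypothetical helper: map a unit ID to a variant, deterministically.

    Hashing the experiment name together with the unit ID means the same
    unit always lands in the same variant for a given experiment.
    """
    digest = hashlib.sha256(f"{experiment}:{unit_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]

# Hypothetical data: three members viewing the same group.
members = ["alice", "bob", "carol"]
group_id = "group_42"

# Splitting on the user ID: each member is bucketed independently,
# so members of the same group may see different experiences.
user_level = {m: assign_variant("vehicle_post_composer", m) for m in members}

# Splitting on the group ID: the member is not part of the hash input,
# so everyone viewing this group sees the same variant.
group_level = {m: assign_variant("vehicle_post_composer", group_id) for m in members}
```

With the group ID as the unit, every entry in `group_level` is identical by construction, which is the consistency guarantee a group-level test needs.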
Facebook made it super easy for engineers to do this. Need to check which variant applies to the current user? Use GK::forVC(). Need the variant for a different entity type? Use GK::forID().
Outside of Facebook, you see this pattern in many different enterprise tools. For example, Figma and Notion have different workspaces, and may want to roll out features or experiments across those, for all users in a given workspace. Amplitude may want to experiment across different organizations, or at the project level. UiPath might want to test a feature per project or per company.
With these use cases in mind, we built Custom IDs for Statsig. Now, your gates and experiments can use different ID types for stability and evaluation, so everyone in Group/Company/Organization/Project A will continue to see the same experience. These checks can live side by side with your user ID-based gates and experiments.
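As a rough sketch of how gates keyed on different ID types can sit side by side, consider the example below. It does not use the Statsig SDK, and every name in it (`User`, `custom_ids`, `unit_id_for`, `check_gate`, the gate names, and the `workspaceID` key) is an assumption made for illustration. Each gate declares which ID type it evaluates on, and the check hashes that ID rather than always hashing the user ID.

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class User:
    """Hypothetical user record carrying a user ID plus other entity IDs."""
    user_id: str
    custom_ids: dict = field(default_factory=dict)

def unit_id_for(user: User, id_type: str):
    """Pick the ID a gate should evaluate on; None if the user lacks it."""
    if id_type == "userID":
        return user.user_id
    return user.custom_ids.get(id_type)

def check_gate(gate_name: str, id_type: str, pass_percent: int, user: User) -> bool:
    """Hypothetical gate check: pass roughly pass_percent% of units of id_type."""
    unit = unit_id_for(user, id_type)
    if unit is None:
        return False  # assumption: fail closed when the ID is missing
    digest = hashlib.sha256(f"{gate_name}:{unit}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100 < pass_percent

# Two different users in the same (hypothetical) workspace.
ana = User("user_1", {"workspaceID": "ws_A"})
ben = User("user_2", {"workspaceID": "ws_A"})

# Workspace-level gate: both users get the same answer.
same_for_workspace = (
    check_gate("new_editor", "workspaceID", 50, ana)
    == check_gate("new_editor", "workspaceID", 50, ben)
)

# User-level gate on the same users: evaluated per person, independently.
ana_sees_new_blue = check_gate("new_blue", "userID", 50, ana)
```

Because the workspace-level check only ever hashes `ws_A`, the two users cannot diverge on that gate, while the user-level gate continues to bucket them individually.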
Trying to experiment on a different unit type than a “user”? Or just miss GK::forID()? Try out customIDs on Statsig.