However, Sample Ratio Mismatch (SRM) can sometimes occur in setups like this, leading to uneven splits between user groups. For instance, in an experiment targeting a 50/50 split between control and test groups, a company might expose 1,000 users. Instead of logging 500 exposures per group, Statsig may only receive data for 200 users in control and 500 in test, a roughly 29/71 split.
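To make the example concrete, here is a minimal sketch (not Statsig's implementation) of the standard chi-squared test commonly used to detect SRM, applied to the counts above. Under a true 50/50 split, we would expect 350 exposures per group out of the 700 actually logged:

```python
from scipy.stats import chisquare

# Observed exposures logged per group vs. the 50/50 expectation.
observed = [200, 500]          # control, test
total = sum(observed)          # 700 exposures actually received
expected = [total / 2] * 2     # 350 per group under a true 50/50 split

stat, p_value = chisquare(f_obs=observed, f_exp=expected)
print(f"chi-squared = {stat:.1f}, p = {p_value:.2e}")
# A tiny p-value (here far below 0.001) flags a sample ratio mismatch:
# a deviation this large is vanishingly unlikely under a fair 50/50 split.
```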
Why does this happen?
For example, a website crash when serving the control variant can prevent the SDK from sending exposure events to Statsig, so affected control users never appear in the data.
Currently, Statsig provides debugging tools that help identify suspect dimensions passed through the SDK. For example, if most control exposures come from the US while the test group is evenly split between the EU and the US, the issue might be linked to the EU release of the SDK.
However, these tools have been limited to analyzing a preset list of dimensions, such as sdk_type, browser, country, and os.
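Conceptually, per-dimension SRM debugging amounts to slicing exposures by each dimension's values and re-running the ratio test on every slice. The sketch below is illustrative rather than Statsig's actual code; the exposure tuples and the significance threshold are assumptions:

```python
from collections import Counter
from scipy.stats import chisquare

def srm_by_dimension(exposures, expected_ratio=0.5, alpha=0.001):
    """Run a chi-squared SRM test separately for each dimension value.

    exposures: iterable of (group, dimension_value) pairs, where group
    is "control" or "test". Returns the dimension values whose observed
    split deviates significantly from the expected ratio.
    """
    counts = Counter(exposures)
    values = {value for _, value in counts}
    flagged = {}
    for value in sorted(values):
        control = counts.get(("control", value), 0)
        test = counts.get(("test", value), 0)
        total = control + test
        if total == 0:
            continue
        expected = [total * expected_ratio, total * (1 - expected_ratio)]
        _, p = chisquare([control, test], f_exp=expected)
        if p < alpha:
            flagged[value] = {"control": control, "test": test, "p": p}
    return flagged

# Example mirroring the scenario above: US looks healthy, EU is skewed.
logs = [("control", "US")] * 180 + [("test", "US")] * 200 \
     + [("control", "EU")] * 20 + [("test", "EU")] * 300
print(srm_by_dimension(logs))   # flags only "EU"
```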
SRM in an experiment is a serious problem: it biases the comparison between groups and renders the results invalid. Debugging SRM is crucial, especially for customers with complex release setups, where pinpointing the source of the issue can be challenging.
The ability to analyze additional, custom dimensions provides much-needed granularity and flexibility, enabling customers to diagnose and resolve SRM more effectively.
We’ve expanded our SRM debugging capabilities to allow customers to define custom user dimensions for analysis. With this update, Statsig will run its SRM Analysis on these custom dimensions, providing deeper insights tailored to individual customer needs.
Step 1: In your project settings, list the custom dimensions you want to analyze.
Step 2: Navigate to the Diagnostics tab and open the Experiment Health Checks.
Step 3: Use the SRM Debugger to review group metrics. This tool highlights any custom dimensions that are likely contributing to SRM issues.
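For the debugger to see a custom dimension, the corresponding attribute has to be present on the user object your SDK sends with each exposure. The snippet below is a hypothetical illustration using Statsig's Python server SDK; the attribute names (app_release, cdn_region) and the experiment name are made up, and you should verify the exact field names against the SDK documentation for your language:

```python
from statsig import statsig, StatsigUser

statsig.initialize("server-secret-key")  # your server secret key

# Hypothetical custom attributes matching the dimensions listed in
# project settings, so Statsig can slice the SRM analysis by them.
user = StatsigUser(
    user_id="user-123",
    country="DE",
    custom={"app_release": "4.2.0", "cdn_region": "eu-west"},
)

experiment = statsig.get_experiment(user, "homepage_redesign")
```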
With this added flexibility, customers can debug SRM with precision, ensuring their experiments produce trustworthy results.