We're excited to share a limited beta of Metrics Explorer: an analytics surface with powerful slicing for metrics. Break out a metric by device type, country or user tier. Explode a ratio metric and see how the numerator and denominator have moved.
Get data you can trust and the insights you need to take action and drive growth. Find this under Metrics -> Explore.
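To make "exploding" a ratio metric concrete, here is a minimal Python sketch. It is not Statsig's implementation; the function name `explode_ratio` and the sample numbers are made up for illustration. It reports how the numerator, the denominator, and the ratio itself moved between two periods.

```python
def explode_ratio(before, after):
    """Compare a ratio metric across two periods, component by component.

    `before` and `after` are (numerator, denominator) tuples, for example
    (purchases, sessions). All names here are hypothetical.
    """
    def pct_change(old, new):
        return (new - old) / old * 100.0

    num0, den0 = before
    num1, den1 = after
    return {
        "numerator_pct": pct_change(num0, num1),
        "denominator_pct": pct_change(den0, den1),
        "ratio_pct": pct_change(num0 / den0, num1 / den1),
    }


# Made-up numbers: purchases stayed flat while sessions grew,
# so the ratio fell even though the numerator did not move.
result = explode_ratio(before=(200, 1000), after=(200, 1250))
print(result)
```

Looking at the components separately shows whether a drop in the ratio comes from a shrinking numerator or a growing denominator.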
The Users tab enables you to diagnose issues for specific users by helping answer questions like "which experiment group was this user in?" or "when did the user first see this feature?" We've just upgraded the backend for this: lookups now take ~5 seconds instead of ~10 minutes.
We've just started rolling out the ability to apply targeting on Holdouts. Holdouts work by "holding back" one set of users from testing and comparing their metrics with normal users. Statsig now lets you apply a Feature Gate to your Holdout. For example, if you wanted an iOS User Holdout, you could apply a Feature Gate that passes only iOS users.
Holdouts are the gold standard for measuring the cumulative impact of experiments you ship. (Learn more)
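As a rough mental model of a gated Holdout, the sketch below checks a targeting gate first and only then decides whether the user is held back. Everything in it is an assumption for illustration: `passes_ios_gate`, `in_holdout`, the `platform` field, and the hash-based bucketing are hypothetical and are not Statsig's algorithm or API.

```python
import hashlib


def passes_ios_gate(user):
    # Hypothetical stand-in for a Feature Gate that passes only iOS users.
    return user.get("platform") == "iOS"


def in_holdout(user, holdout_name, holdout_pct, gate=passes_ios_gate):
    """Return True if the user is held back from experiments.

    Users who fail the targeting gate are never part of the holdout.
    Hash bucketing is an illustrative assumption, not Statsig's method.
    """
    if not gate(user):
        return False
    key = f"{holdout_name}:{user['user_id']}".encode()
    bucket = int(hashlib.sha256(key).hexdigest(), 16) % 10000  # stable value in [0, 10000)
    return bucket < holdout_pct * 100  # e.g. 5.0 percent covers buckets 0..499


android_user = {"user_id": "u-1", "platform": "Android"}
ios_user = {"user_id": "u-2", "platform": "iOS"}
print(in_holdout(android_user, "ios_holdout", 5.0))  # gate fails, so never held back
print(in_holdout(ios_user, "ios_holdout", 5.0))
```

Because the bucket depends only on the holdout name and user ID, the same user gets the same answer on every call.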
As teams have grown their Statsig usage, so has old experiment clutter. A few months back we launched a suite of tooling to manage the lifecycle of your feature flags, and today we're rolling out automated clean-up logic for old experiments as well.
Starting this week, Statsig will be setting a default Pulse Results compute window of 90 days for all new experiments going forward, after which your Pulse Results will stop being computed. Please note this only applies to experiments, not feature gates, holdouts, or any other config types.
You will be able to extend this window at the individual experiment level as you approach the 90-day cap, and your user assignment will not be impacted even if results stop being computed. Read more in our docs.
In the coming days, owners of impacted experiments will receive an email notification and will have 14 days to extend the Results compute window if they wish. As always, don't hesitate to reach out if you have any questions. Our hope is that this both cleans up your Console and saves teams money long-term!
Have you ever set up a relatively complex Custom Metric and then realized you want another similar metric but with a slight tweak? Yep, we have too! To make that process easy, today we're introducing the ability to clone Custom Metrics.
To clone a Custom Metric, go to the "…" menu on a metric page, then select "Clone." You will have the opportunity to name your new metric, add a description and tags, and then we will auto-fill all the inputs of the metric definition from the source metric. Customize to your liking and you're good to go!
Happy Friday, Statsig Community! To cap off a beautiful week here in Seattle ☀️, we have a number of exciting launch updates to share:
To date, when you launched a new feature roll-out or experiment, you had to wait 24 hours to start seeing your Pulse results. Today, we're very excited to shorten that time significantly with the launch of more real-time Pulse. Now, you will see Pulse results start to flow through within 10-15 minutes of starting your roll-out or experiment.
A few things to consider:
For the first 24 hours, results do not include confidence intervals; early metric lifts are meant to help you ensure that things are looking roughly as expected and verify the configuration of your gate/experiment, NOT to make any launch decisions.
The Pulse hovercard view will look a bit different; time-series and top-line impact estimates will not be available until the first 24-hour daily lift calculation.
At some companies, a user may have a different ID in each environment, so you may want to specify which environment an ID override applies to. To enable this, we've added the ability to specify a target environment for Overrides in Experiments. For Gates, you can achieve this by creating an environment-specific rule.
(vs. Strictly Time Duration)
We're introducing more flexibility into how you can measure & track experiment target duration. Now, you can choose between setting a target # of days or a target # of exposures an experiment needs to hit before a decision can be made.
To configure a target # of exposures, tap "Advanced Settings" in the Experiment Setup tab, then under "Experiment Measured In" select "Exposures" (vs. "Days"). The progress tracker at the top of your experiment will now show progress toward the target number of exposures.
See our docs for more details.
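The difference between the two modes can be sketched in a few lines of Python. This is only an illustration of the idea; `experiment_progress` and its parameters are hypothetical names, not part of any Statsig SDK.

```python
def experiment_progress(measured_in, target, days_elapsed, exposures):
    """Fraction of the target reached so far, capped at 1.0.

    measured_in is "Days" or "Exposures", mirroring the setting described above.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    if measured_in == "Days":
        current = days_elapsed
    elif measured_in == "Exposures":
        current = exposures
    else:
        raise ValueError(f"unknown unit: {measured_in!r}")
    return min(current / target, 1.0)


# Same experiment, two ways of tracking it (made-up numbers).
by_days = experiment_progress("Days", target=14, days_elapsed=7, exposures=2500)
by_exposures = experiment_progress("Exposures", target=10000, days_elapsed=7, exposures=2500)
print(by_days, by_exposures)
```

An exposure target is useful when traffic is uneven: a slow week delays the decision instead of letting the experiment finish underpowered.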
Statsig manages randomization during experiment assignment. In some B2B cases (or other low-scale, high-variance cases), the sample is too small for the law of large numbers to make randomly assigned groups comparable. Here it is helpful to manually assign users to test and control to ensure both groups are comparable. Statsig now lets you do this. Learn More
What is Stratified Sampling?
Stratified sampling is a sampling method that ensures specific groups of data (or users) are properly represented. You can think of this like slicing a birthday cake. If sliced recklessly, some people may get too much frosting and others will get too little. But when sliced carefully, each slice is a proper representation of the whole. In Data Science, we commonly trust random sampling. The Law of Large Numbers ensures that a sufficiently-sized sample will be representative of the entire population. However, in some cases, this may not be true, such as:
When the sample size is small
When the samples are heterogeneous
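One way to produce such a balanced assignment offline is to split each stratum evenly between test and control. The sketch below is a minimal illustration, not Statsig's implementation; `stratified_assign`, the `tiers` mapping, and the customer IDs are hypothetical.

```python
import random
from collections import defaultdict


def stratified_assign(users, stratum_of, seed=0):
    """Assign users to "test" or "control" so each stratum is split evenly.

    users: list of user IDs. stratum_of: function mapping a user ID to its stratum.
    """
    rng = random.Random(seed)  # fixed seed keeps the assignment reproducible
    by_stratum = defaultdict(list)
    for user in users:
        by_stratum[stratum_of(user)].append(user)

    assignment = {}
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        rng.shuffle(members)
        # Alternate groups within the stratum; a random starting group means
        # odd-sized strata do not always give the extra user to the same side.
        offset = rng.randrange(2)
        for i, user in enumerate(members):
            assignment[user] = "test" if (i + offset) % 2 == 0 else "control"
    return assignment


# Hypothetical B2B customers: 4 enterprise accounts and 6 small businesses.
tiers = {f"ent-{i}": "enterprise" for i in range(4)}
tiers.update({f"smb-{i}": "smb" for i in range(6)})

groups = stratified_assign(list(tiers), tiers.get)
print(groups)
```

With purely random assignment, all four enterprise accounts could land in one group; splitting within each stratum rules that out.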
We gave our Warehouse Ingestion tab a total makeover so that you can have better visibility into your import status! Some key improvements include:
A simple visual display to track your import progress, with an extended date range
Verify your imported data with ease and confidence using our import volume chart and data samples
Take actions more easily and stay in control of your imports (use the "…" menu), whether you want to trigger a backfill or edit your daily ingestion schedule
We've heard from some folks that they want to explore metrics even outside an experiment's context. We've just started adding capabilities to do this. Now, when you're looking at a metric in the Metrics Catalog you can:
compare values to a prior period to look for anomalies
apply smoothing to understand trends
look at other metrics at the same time to see correlation (or lack thereof)
group by metric dimensions
save this exploration as a Dashboard to revisit/share with others
view current experiments and feature rollouts that impact this metric (also in Insights)
This starts rolling out March 31.
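For intuition, two of the operations above (smoothing and prior-period comparison) can be sketched in plain Python. This is a generic illustration with made-up data, assuming a trailing moving average as the smoothing method; the function names are hypothetical and the Console may compute these differently.

```python
def moving_average(values, window):
    """Trailing moving average; early points use however many values exist so far."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def vs_prior_period(values, period):
    """Percent change of each point versus the point `period` steps earlier."""
    return [
        (values[i] - values[i - period]) / values[i - period] * 100.0
        for i in range(period, len(values))
    ]


# Made-up daily metric values covering two weeks.
daily = [100, 104, 98, 101, 99, 120, 125, 103, 107, 99, 60, 102, 123, 128]

print(moving_average(daily, window=7))
print(vs_prior_period(daily, period=7))  # week-over-week change, in percent
```

Comparing each day to the same weekday one week earlier makes the one-off dip stand out, while the weekend bump cancels out as ordinary seasonality.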
Now, you can specify which audience you want to calculate experimental power for, by selecting any existing Feature Gate via the Power Calculator.
To do this, go to the Power Calculator (either under "Advanced Settings" in Experiment creation or via the "Tools & Resources" menu) and select "Population".
This will kick off an async power calculation based on the selected targeting gate's historical metric value(s), and you will be notified via email and Slack once your power analysis is complete.
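To see why the population matters, here is a minimal sketch of a textbook sample-size calculation for a two-sided, two-sample z-test on a difference in means. It is not necessarily the method the Power Calculator uses; `samples_per_group` and the input numbers are hypothetical. The historical mean and standard deviation would come from the users who pass the selected gate.

```python
import math
from statistics import NormalDist


def samples_per_group(mean, std, mde_pct, alpha=0.05, power=0.8):
    """Users needed per group to detect a relative lift of mde_pct percent.

    Uses n = 2 * (z_{1 - alpha/2} + z_{power})^2 * std^2 / delta^2,
    where delta is the absolute effect size.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    delta = mean * mde_pct / 100.0
    return math.ceil(2 * (z_alpha + z_power) ** 2 * std ** 2 / delta ** 2)


# Made-up historical values for two populations with the same mean:
# a noisier population needs more users to detect the same 5% lift.
all_users = samples_per_group(mean=10.0, std=5.0, mde_pct=5.0)
gated_users = samples_per_group(mean=10.0, std=8.0, mde_pct=5.0)
print(all_users, gated_users)
```

Because the required sample size scales with the variance of the metric, a power estimate based on the wrong audience can be far off for the one you actually target.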