We've refreshed the layout of the Experiment Setup page, which now includes a TEST button with helpful resources for validating your experiment setup. The Advanced Settings section has also been reorganized into Analysis Configuration and Experiment Population categories, with enhanced documentation links for users who want to learn more about each feature.
You can now break down funnel performance by two properties at once, giving you a clearer view of how different user segments progress through each step.
Apply up to two group-by properties in a funnel (e.g. country and platform)
View combined breakdowns like US / iOS, Canada / Android, etc.
Analyze performance across more specific cohorts in a single chart
Once you apply a group by in your funnel chart, you'll now have the option to add a second property. The chart will display the top combinations of those properties, ranked by event volume in the first step of the funnel.
For example, if you group by platform and experiment variant, the chart will show funnels for the most common combinations like Android / Treatment or iOS / Control.
With support for two group-bys, you can run more detailed comparisons without duplicating charts or manually applying filters. This is especially useful for spotting performance differences across dimensions like geography, device type, or experiment conditions, all in one view.
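Conceptually, the ranking described above (keeping the top property combinations by event volume in the funnel's first step) can be sketched as follows. The data and property names here are purely illustrative; Statsig computes this for you server-side.

```python
from collections import Counter

# Hypothetical first-step funnel events, each carrying the two
# group-by properties applied to the chart.
first_step_events = [
    {"country": "US", "platform": "iOS"},
    {"country": "US", "platform": "iOS"},
    {"country": "Canada", "platform": "Android"},
    {"country": "US", "platform": "Android"},
    {"country": "US", "platform": "iOS"},
]

# Count each (country, platform) combination and keep the top N by
# volume, mirroring how the chart picks which breakdowns to display.
combo_counts = Counter(
    (e["country"], e["platform"]) for e in first_step_events
)
top_combos = combo_counts.most_common(2)
print(top_combos)  # top of the list: (('US', 'iOS'), 3)
```

Each entry in `top_combos` corresponds to one combined breakdown (e.g. US / iOS) shown as its own funnel in the chart.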
You can now format the y-axis in time series charts within Metric Drilldown, giving you better control over how values are displayed.
Choose from multiple y-axis formats:
Number (default)
Percentage
Time (e.g. seconds, minutes)
Decimal Bytes (e.g. kB, MB)
Bytes
Bits
Apply formatting to better match the metric you're analyzing
When editing a time series chart in Metric Drilldown, you'll see a new y-axis formatting option. Select the format that best fits your metric: for example, use percentage for conversion rates or time for session duration.
This gives you more clarity when interpreting trends, especially for metrics like load time, bandwidth, or success rates. Instead of manually translating values, you can now visualize them in context directly on the chart.
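As a rough illustration of the byte formats listed above, here is a minimal sketch that assumes "Decimal Bytes" means 1000-based units (kB, MB) and "Bytes" means 1024-based binary units (KiB, MiB); the exact conversion rules Statsig applies may differ.

```python
def format_bytes(value: float, decimal: bool = True) -> str:
    """Format a byte count in decimal (1000-based) or binary
    (1024-based) units. A simplified sketch, not Statsig's code."""
    base = 1000 if decimal else 1024
    units = (["B", "kB", "MB", "GB", "TB"] if decimal
             else ["B", "KiB", "MiB", "GiB", "TiB"])
    for unit in units:
        if abs(value) < base:
            return f"{value:.1f} {unit}"
        value /= base
    return f"{value:.1f} {units[-1]}"

print(format_bytes(1_500_000))         # 1.5 MB
print(format_bytes(1_500_000, False))  # 1.4 MiB
```

The same raw value reads quite differently under the two formats, which is why picking the format that matches your metric matters.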
We're adding the ability to quickly copy metrics used in one experiment to a new experiment.
It's easy to select metrics for an experiment in Statsig, and tags or templates are powerful tools to manage collections. Sometimes, though, you already set up the perfect measurement suite on another experiment and just want to copy that - and now you can!
This is especially powerful for customers using local metrics - experiment-scoped custom metric definitions - since you can copy those between experiments without needing to add them to your permanent metric catalog.
Generally, experimentalists make decisions by comparing means and using standard deviation to assess spread. There are exceptions, like using percentile metrics, but the vast majority of comparisons are done this way.
It's effective, but it's also well known that means mask a lot of information. To help experimentalists on Statsig understand what's going on behind the scenes, we're adding an easy interface to dig into the distributions behind results.
Here, we can see a Pulse result showing a statistically significant lift in revenue for both of our experimental variants.
By opening the histogram view (found in the statistics details), we can easily see that this lift is mostly driven by more users moving from the lowest-spend bucket into higher buckets.
This is available today on Warehouse Native - and we're scoping out Statsig Cloud.
We're providing more control over when experiment results load in Warehouse Native. In addition to schedules and API-based triggers, customers can now specify the days of the week to load results on, either for a given experiment or as an organizational default.
In addition, org-level presets for turbo mode and other load settings will help people keep their warehouse bill and load times slim! Read more at https://docs.statsig.com/statsig-warehouse-native/guides/costs
Many customers on Statsig run hundreds of concurrent experiments. On Warehouse Native, this means that interactive queries from the Statsig console can run slowly during peak hours for daily compute.
Now, users on Snowflake, Databricks, and BigQuery can specify separate compute resources for 'interactive' console queries vs. scheduled 'job' queries, meaning the interactive queries will always be snappy. This also means a large compute resource used for large-scale experiment analysis won't get spun up for small interactive queries like loading data samples.
For those warehouses, we've also added the ability to specify different service accounts for different roles within the Statsig roles system. This means the scorecard service account can have the access to user data it needs for calculating experiment results, while customers apply privacy rules, like masking sensitive fields, to keep that data from being exposed through interactive queries in the Statsig console.
Three exciting new improvements to our recently launched Topline Alerts product:
Embed variables in your alert message: You can now insert the event name, the alert value, the warn threshold, the alert threshold, and (soon) the value you've grouped your events by directly into your notification text body to provide more context when viewing alert notifications.
Test your notification manually: You can now trigger each state of your alert (Raise, Warn, Resolve, No Data) to ensure your alert is configured as desired at setup time.
View Evaluated vs. Source data: In the "Diagnostics" tab of your alert, you can now toggle between Evaluated mode (the aggregated data used for the final alert evaluation) and Source mode (the underlying pre-aggregation event data used as input to your alert calculation). While Evaluated mode is still restricted to a 24-hour event window, you can look back further in Source mode to see how the event you're alerting on has trended over a longer window.
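To illustrate the variable embedding described above, here is a minimal templating sketch. The placeholder names are hypothetical and not Statsig's actual template syntax.

```python
# Hypothetical alert message template with embedded variables.
template = (
    "{event_name} is at {value}, above the alert threshold of "
    "{alert_threshold} (warn threshold: {warn_threshold})."
)

# Values the alerting system would supply at notification time.
context = {
    "event_name": "checkout_errors",
    "value": 152,
    "alert_threshold": 100,
    "warn_threshold": 80,
}

message = template.format(**context)
print(message)
```

The rendered message carries the event name and thresholds inline, so the notification is self-explanatory without opening the alert's detail page.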
Surrogate metrics are now available as a type of "latest value" metric in Warehouse Native.
Surrogate metrics (also called proxy or predictive metrics) enable measurement of a long-term outcome that can be impractical to measure during an experiment. However, if used without proper statistical adjustment, the false-positive rate will be inflated.
While Statsig won't create surrogate metrics for you, when you've created one you can input the mean squared error (MSE) of your model, so that we can accurately calculate p-values and confidence intervals that account for the inherent error in the predictive model.
Surrogate metrics will therefore have wider confidence intervals and larger p-values than the same metric without any MSE specified.
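The exact formula isn't shown here, but one simple way to account for model error is to fold the MSE into the variance used for the lift's standard error, which widens the interval. A rough sketch under that assumption (not Statsig's actual calculation):

```python
import math
from statistics import NormalDist

def adjusted_interval(mean_diff, sample_var, n, mse, alpha=0.05):
    """Illustrative only: widen a surrogate metric's standard error
    by adding the predictive model's MSE to the per-unit variance."""
    se = math.sqrt((sample_var + mse) / n)  # inflate variance by model error
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return (mean_diff - z * se, mean_diff + z * se)

# Without MSE the interval is tighter; specifying MSE widens it.
lo0, hi0 = adjusted_interval(0.5, sample_var=4.0, n=1000, mse=0.0)
lo1, hi1 = adjusted_interval(0.5, sample_var=4.0, n=1000, mse=2.0)
print(hi0 - lo0 < hi1 - lo1)  # True: accounting for MSE widens the CI
```

The wider interval is the point: it prevents the predictive model's own noise from manufacturing false statistical significance.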
Learn more here!
You can now compare up to 15 groups in funnel charts when using a group by, up from the previous limit of 5.
Select and compare up to 15 groups in a funnel analysis
Use a new selector to control exactly how many groups to display
Once you apply a group by (e.g., browser, country, experiment variant), a group count selector appears. Use it to choose how many top groups to include based on event volume.
This gives you more flexibility to analyze performance across more segments, especially helpful for large experiments, multi-region launches, or platform-specific funnels.
Let us know how this works for your use case; we're always looking to improve.