Frequently Asked Questions

A curated summary of the top questions asked on our Slack community, often relating to implementation, functionality, and building better products generally.
Statsig FAQs
EXPERIMENTS ANALYTICS

How are p-values of experiments calculated and is it always assumed that the underlying distribution is a normal distribution?

In the context of hypothesis testing, the p-value is the probability of observing an effect equal to or larger than the measured metric delta, assuming that the null hypothesis is true. A p-value lower than a pre-defined threshold (the significance level) is considered evidence of a true effect.

The calculation of the p-value depends on the number of degrees of freedom (ν). For most experiments, a two-sample z-test is appropriate. However, for smaller experiments with ν < 100, Welch's t-test is used. In both cases, the p-value is dependent on the metric mean and variance computed for the test and control groups.

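The z-statistic of a two-sample z-test is calculated using the formula: Z = (Xt - Xc) / sqrt(var(Xt) + var(Xc)), where Xt and Xc are the mean metric values of the test and control groups, and var(Xt) and var(Xc) are the variances of those means (each group's sample variance divided by its sample size). The two-sided p-value is then obtained from the standard normal cumulative distribution function Φ as p = 2 × (1 - Φ(|Z|)).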
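As a rough illustration of the z-test described above, here is a minimal Python sketch using only the standard library. The function name, argument names, and example numbers are hypothetical and not part of Statsig's API; the sketch assumes `var_t` and `var_c` are per-group sample variances of the metric, so each is divided by the group size to get the variance of the mean.

```python
import math
from statistics import NormalDist


def two_sample_z_test(mean_t, var_t, n_t, mean_c, var_c, n_c):
    """Return (z, two-sided p-value) for the difference in group means.

    var_t and var_c are sample variances of the metric (an assumption of
    this sketch); dividing by the group size gives the variance of each mean.
    """
    var_mean_t = var_t / n_t
    var_mean_c = var_c / n_c
    z = (mean_t - mean_c) / math.sqrt(var_mean_t + var_mean_c)
    # Two-sided p-value from the standard normal CDF.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical example: 5,000 users per group, illustrative numbers only.
z, p = two_sample_z_test(mean_t=10.5, var_t=25.0, n_t=5000,
                         mean_c=10.2, var_c=25.0, n_c=5000)
print(f"z = {z:.3f}, p = {p:.4f}")
```

With these made-up inputs the standard error of the difference is 0.1, so a delta of 0.3 corresponds to a z-statistic of about 3.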

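For smaller sample sizes, Welch's t-test is the preferred statistical test due to its lower false positive rates in cases of unequal sizes and variances. The t-statistic is computed in the same way as the z-statistic of the two-sample z-test, and the degrees of freedom ν are computed using the Welch-Satterthwaite equation: ν = (var(Xt) + var(Xc))² / (var(Xt)² / (Nt - 1) + var(Xc)² / (Nc - 1)), where Nt and Nc are the sample sizes of the test and control groups. The p-value is then obtained from Student's t-distribution with ν degrees of freedom.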
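The sketch below shows how the Welch's t-test quantities could be computed, assuming the standard Welch-Satterthwaite approximation for the degrees of freedom. The function and variable names are hypothetical, and the inputs are again assumed to be per-group sample variances of the metric.

```python
import math


def welch_t_statistic(mean_t, var_t, n_t, mean_c, var_c, n_c):
    """Return (t, nu): Welch's t-statistic and its degrees of freedom."""
    # Variance of each group mean (sample variance / sample size).
    var_mean_t = var_t / n_t
    var_mean_c = var_c / n_c
    # Same form as the z-statistic.
    t_stat = (mean_t - mean_c) / math.sqrt(var_mean_t + var_mean_c)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    nu = (var_mean_t + var_mean_c) ** 2 / (
        var_mean_t ** 2 / (n_t - 1) + var_mean_c ** 2 / (n_c - 1)
    )
    return t_stat, nu


# Hypothetical small experiment with unequal sizes and variances.
t_stat, nu = welch_t_statistic(mean_t=12.0, var_t=9.0, n_t=20,
                               mean_c=10.0, var_c=16.0, n_c=30)
print(f"t = {t_stat:.3f}, nu = {nu:.1f}")
```

Python's standard library has no Student's t CDF, so the p-value step is left out here. If SciPy is available, `2 * scipy.stats.t.sf(abs(t_stat), nu)` is one way to get the two-sided p-value.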

While the normal distribution is often used in these calculations due to the central limit theorem, the specific distribution used can depend on the nature of the experiment and the data. For instance, in Bayesian experiments, the posterior probability distribution is calculated, which can involve different distributions depending on the prior beliefs and the likelihood.

These tests assume that the sample means are normally distributed. Thanks to the central limit theorem, this holds approximately for most metrics when sample sizes are large enough, even if the distribution of the metric values themselves is not normal.
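The central limit theorem point can be checked with a small simulation. This is an illustrative sketch only: the exponential distribution, the sample size of 200, and the helper name `skewness` are arbitrary choices for the example, not anything taken from Statsig.

```python
import random
import statistics


def skewness(values):
    """Population skewness: 0 for a symmetric distribution such as the normal."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


random.seed(0)

# Raw metric values: heavily right-skewed (exponential with mean 1).
raw_values = [random.expovariate(1.0) for _ in range(5000)]

# Means of 2,000 simulated groups, each containing 200 such values.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(200)])
    for _ in range(2000)
]

print(f"skewness of raw values:   {skewness(raw_values):.2f}")
print(f"skewness of sample means: {skewness(sample_means):.2f}")
```

The raw values should show strong skew, while the group means should be much closer to symmetric, which is why the normal approximation is applied to means rather than to individual metric values.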

