Bayesian vs. frequentist statistics: Not a big deal?

Tue Feb 11 2025

Yuzheng Sun, PhD

Data Scientist, Statsig

We often get caught up in the technical details of statistics, losing sight of the bigger picture.

One common area of confusion and heated debate is the difference between Bayesian and frequentist approaches.

The debate sounds like a fundamental clash, but often, it's more about how we talk about uncertainty than the actual decisions we make based on data.

Let's explore this with a focus on why the differences often don't matter as much as some people make them out to.

The core confusion: What does "90% probability" really mean?

Imagine you're trying to figure out the average height of adults in your city. You collect some data and calculate a range of possible values.

  • Frequentist: A frequentist might say, "We calculated a 90% confidence interval of 5'6" to 5'9"." This sounds like there's a 90% chance the true average height is within that range, right? Not quite.

  • Bayesian: A Bayesian might say, "We calculated a 90% credible interval of 5'6" to 5'9"." This does mean there's a 90% chance the true average height is in that range, based on their model.

So, who's right? The answer is surprisingly simple: both, within their own frameworks. The difference stems from how they treat the idea of an "unknown" average height and what "probability" represents.

Thinking like a frequentist: It's all about the procedure

Frequentists see the world in terms of repeated experiments. Think of it like this:

  • The unknown is fixed: The true average height of adults in your city isn't changing while you're analyzing your data. It's a fixed, albeit unknown, number.

  • Randomness is in the data: The randomness comes from which people you happen to sample. If you repeated your survey many times, you'd get slightly different results each time.

  • Confidence intervals are about repetition: A 90% confidence interval means that if you repeated this entire process (collecting data and calculating the interval) many times, 90% of those intervals would contain the true average height.
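The "long-run repetition" idea is easy to verify by simulation. The sketch below (with hypothetical height numbers, not real data) repeats the survey-and-compute-an-interval procedure many times and checks how often the 90% confidence interval actually contains the fixed true mean:

```python
import numpy as np

rng = np.random.default_rng(42)

true_mean = 67.5    # hypothetical true average height, in inches
true_sd = 3.0
n_people = 50       # people surveyed per experiment
n_repeats = 10_000  # how many times we repeat the whole procedure

z = 1.645  # two-sided 90% normal critical value
covered = 0
for _ in range(n_repeats):
    sample = rng.normal(true_mean, true_sd, n_people)
    se = sample.std(ddof=1) / np.sqrt(n_people)
    lo, hi = sample.mean() - z * se, sample.mean() + z * se
    # The interval changes every repetition; the true mean never does
    if lo <= true_mean <= hi:
        covered += 1

coverage = covered / n_repeats
print(f"Empirical coverage: {coverage:.3f}")  # close to 0.90
```

Note that the probability statement is about the procedure: roughly 90% of the intervals produced this way contain the true mean, but any single interval either contains it or doesn't.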

Thinking like a Bayesian: It's all about beliefs

Bayesians take a different approach. They treat the unknown average height as something that can have a probability distribution.

  • The unknown is uncertain: Before you see any data, you might have some initial belief (a "prior") about the average height. Maybe you think it's probably around 5'7", but you're not sure.

  • Data updates beliefs: The data you collect updates this prior belief, leading to a "posterior" distribution. This posterior represents your updated understanding of the average height.

  • Credible intervals are about probability: A 90% credible interval means there's a 90% probability (based on your model and the data) that the true average height falls within that range.

Why the philosophies can seem to clash

The core difference is this:

  • Frequentists: Focus on the long-run frequency of events. Probability is about how often something would happen if you repeated the experiment many times.

  • Bayesians: Focus on the degree of belief or certainty about an unknown. Probability is a measure of how likely something is, given your current knowledge.

Do these differences actually matter in practice?

Here's the surprising part: often, not as much as you'd think!

  • Large samples: When you have a lot of data, Bayesian and Frequentist approaches tend to give very similar results. The data overwhelms any prior beliefs in the Bayesian approach.

  • Uninformative priors: If a Bayesian uses a "flat" or "uninformative" prior (meaning they don't have strong initial beliefs), the results often align closely with Frequentist methods.

  • Real-world decisions: Imagine you're testing two versions of a website (A/B testing).

    • A Frequentist might see if a 95% confidence interval for the difference in conversion rates excludes zero.

    • A Bayesian might see if a 95% credible interval for the difference lies entirely above zero.

    • In most cases, they'll reach the same conclusion about which version is better.
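Here is a minimal sketch of that A/B comparison, using made-up conversion counts. The frequentist side builds a normal-approximation 95% confidence interval for the difference; the Bayesian side uses flat Beta(1, 1) priors and simulates the posterior difference:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical A/B test results
n_a, conv_a = 10_000, 520  # version A: 5.2% conversion
n_b, conv_b = 10_000, 600  # version B: 6.0% conversion

# Frequentist: 95% CI for the difference in conversion rates
p_a, p_b = conv_a / n_a, conv_b / n_b
se = np.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
diff = p_b - p_a
ci = (diff - 1.96 * se, diff + 1.96 * se)
freq_ship_b = ci[0] > 0  # interval excludes zero

# Bayesian: flat Beta(1, 1) priors, posterior sampled by simulation
draws_a = rng.beta(1 + conv_a, 1 + n_a - conv_a, 100_000)
draws_b = rng.beta(1 + conv_b, 1 + n_b - conv_b, 100_000)
cred = np.percentile(draws_b - draws_a, [2.5, 97.5])
bayes_ship_b = cred[0] > 0  # credible interval lies above zero

print(f"Frequentist CI: ({ci[0]:.4f}, {ci[1]:.4f}) -> ship B: {freq_ship_b}")
print(f"Credible interval: ({cred[0]:.4f}, {cred[1]:.4f}) -> ship B: {bayes_ship_b}")
```

With this much data and an uninformative prior, the two intervals are nearly identical and both methods recommend shipping B, which is exactly the "differences often don't matter" point.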

A note on Bayesian methods with informative priors

Bayesian methods with informative priors are one of the few areas where different approaches can lead to different decisions and business outcomes. In theory, they offer several advantages:

  1. Faster, more accurate decision-making

  2. The ability to leverage past information

  3. A structured way to debate underlying assumptions

Because of these benefits, some advocate for their adoption, including data scientists at companies like Amazon and Netflix (ref).

However, in practice, Bayesian methods with informative priors can be risky. Due to principal-agent problems and a general bias toward positive results, they can be misused to manipulate experiment outcomes while maintaining the appearance of scientific rigor. A skilled data scientist equipped with this method can conjure almost any result. My discussion with Dr. Kenneth Huang explores these risks in both mathematical and practical terms.

We plan to roll out Bayesian with informative priors soon, but we’ll also provide tools for oversight — such as enabling experimentation teams (e.g., centers of excellence) to enforce disciplined, well-reasoned priors. We will publish a more detailed post with the feature launch. Here, I want to caution against using this approach without carefully considering its secondary effects. 

The bottom line: It's more about “interpretation”

  • Frequentist confidence intervals: Tell you about the long-run performance of your method. They don't make probability statements about a specific interval.

  • Bayesian credible intervals: Allow you to make direct probability statements about the unknown parameter, based on your model and the data.

Both approaches are valid and useful. The choice often comes down to:

  • Your comfort level with priors: Are you comfortable incorporating prior beliefs into your analysis?

  • How you want to communicate: Do you prefer to talk about long-run frequencies or direct probabilities?

  • Your field's conventions: Some fields have strong traditions favoring one approach over the other.

  • Risk tolerance: Bayesian methods work well when the cost to ship is low, or the risk of shipping something bad is low, because you will move in the right direction more quickly than if you only ship at p < 0.05.

In the end, the Bayesian vs. Frequentist debate is largely philosophical. While the interpretations differ, the practical implications are often minimal. 

Bayesian methods do not introduce any new information: both approaches observe the same means and standard deviations from the test groups. Focus on understanding the assumptions of each approach and choosing the one that best fits your specific situation and communication goals. If you are not sure, I have two specific pieces of advice:

  1. Use frequentist methods for the sake of simplicity, to reduce communication overhead.

  2. In either approach, think about your decision as a bet. Leaders often have to operate under uncertainty; the job of data scientists is to estimate risks and probabilities, then make a recommendation. The quality of the decision is what matters.

Don't get bogged down in the "war". Understand the theoretical debate, but focus on the business outcome.
