How to calculate true positive rate in experiments

Tue Jan 14 2025

Ever wondered how effective your experiments really are? Whether you're developing a new feature or testing a hypothesis, understanding how well your models identify positive cases is crucial. That's where the true positive rate (TPR) comes into play.

In this blog, we'll dive into what TPR is, why it matters in experiments, and how you can calculate and maximize it. Let's demystify this important metric and see how it can help you make better, data-driven decisions.

Understanding true positive rate in experiments

Let's talk about the true positive rate (TPR)—it's a key metric when you're trying to figure out how effective your experiments are. Simply put, TPR (also known as sensitivity or recall) tells you how good your model is at catching the positives out of all the actual positive cases.

So, how do we calculate it? The formula is straightforward: TPR = TP / (TP + FN). Here, TP (true positives) is the number of times your model correctly said "Yes," and FN (false negatives) is when it missed a positive case. By using this formula, you can see how many positives you got right out of all the actual positives.

But how does this look in practice? To crunch the numbers, start by grabbing your counts of true positives and false negatives from your experiment's results. Then, just plug them into the formula. You'll get a value between 0 and 1, which you can multiply by 100 to turn into a percentage—that way, it's easier to gauge how well you're doing.
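To make this concrete, here's a minimal Python sketch (the function name and example counts are just for illustration):

```python
def true_positive_rate(tp: int, fn: int) -> float:
    """Return TPR = TP / (TP + FN): the share of actual positives caught."""
    if tp + fn == 0:
        raise ValueError("No actual positives; TPR is undefined.")
    return tp / (tp + fn)

# Hypothetical counts pulled from an experiment's results
tpr = true_positive_rate(tp=80, fn=20)
print(f"TPR: {tpr:.2f} ({tpr * 100:.0f}%)")  # TPR: 0.80 (80%)
```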

Understanding TPR is super important when you're working with models that make "yes" or "no" decisions. If your TPR is high, it means your model is great at finding the positives. If it's low, well, there's room for improvement. By keeping an eye on your TPR, you can make sure your experiments are giving you solid, reliable results. Tools like Statsig can help you monitor and optimize your TPR, ensuring your experiments yield accurate insights.

The role of TPR in relation to other metrics

Now, let's bring in the confusion matrix—it's like a scoreboard for your model's predictions. It breaks down predictions into true positives, false positives, true negatives, and false negatives. From this, we get both the true positive rate (TPR) and the false positive rate (FPR). While TPR shows us how good we are at catching the positives, FPR tells us how often we're mistakenly flagging negatives as positives.
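If you're working in Python, scikit-learn can pull all four counts straight from a confusion matrix. Here's a sketch with made-up labels and predictions:

```python
from sklearn.metrics import confusion_matrix

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]  # what actually happened
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 1]  # what the model said

# For binary labels, ravel() unpacks the 2x2 matrix as tn, fp, fn, tp
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

tpr = tp / (tp + fn)  # how many actual positives we caught
fpr = fp / (fp + tn)  # how many actual negatives we wrongly flagged
print(f"TPR: {tpr:.2f}, FPR: {fpr:.2f}")
```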

That's why balancing TPR and specificity (also known as the true negative rate) is key. Specificity tells us how good we are at correctly identifying negatives. By tweaking the classification threshold, you can find the sweet spot between TPR and specificity, depending on what's more important for your situation.
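One way to build intuition for this trade-off is to sweep the threshold yourself and watch TPR and specificity move in opposite directions. A sketch with hypothetical scores:

```python
import numpy as np

# Hypothetical true labels and predicted probabilities
y_true = np.array([1, 1, 1, 0, 0, 1, 0, 0, 1, 0])
y_score = np.array([0.9, 0.7, 0.4, 0.3, 0.6, 0.8, 0.2, 0.5, 0.65, 0.1])

for threshold in (0.3, 0.5, 0.7):
    y_pred = (y_score >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    print(f"threshold={threshold:.1f}  "
          f"TPR={tp / (tp + fn):.2f}  specificity={tn / (tn + fp):.2f}")
```

A lower threshold catches more positives (higher TPR) but flags more negatives by mistake (lower specificity).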

To help visualize this trade-off, we use the Receiver Operating Characteristic (ROC) curve. It plots TPR against FPR at various thresholds. The Area Under the Curve (AUC) gives us a single number to represent overall performance—the higher, the better.
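scikit-learn computes both directly from labels and scores. A quick sketch, reusing the hypothetical data from above:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true = np.array([1, 1, 1, 0, 0, 1, 0, 0, 1, 0])
y_score = np.array([0.9, 0.7, 0.4, 0.3, 0.6, 0.8, 0.2, 0.5, 0.65, 0.1])

# roc_curve evaluates FPR and TPR at every candidate threshold
fpr, tpr, thresholds = roc_curve(y_true, y_score)
auc = roc_auc_score(y_true, y_score)
print(f"AUC: {auc:.2f}")  # 1.0 is perfect; 0.5 is random guessing
```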

Understanding how TPR, FPR, and specificity interplay helps you make smarter decisions when looking at your experiment's results. Using techniques like CUPED and sequential testing, you can reduce variance and keep false positive rates in check. That way, you get more reliable insights from your experiments. At Statsig, we provide tools that help you visualize these metrics and make sense of them in the context of your experiments.

Step-by-step calculation of true positive rate

Let's dive into an example to see how to calculate TPR step by step. First, you need the numbers for true positives (TP) and false negatives (FN) from your confusion matrix. Basically, you compare what your model predicted against what actually happened in your dataset.

Imagine your experiment included 100 cases that were actually positive. Your model correctly identified 80 of them—that's your TP. But it missed the other 20, misclassifying them as negative—that's your FN. Plugging these into our formula:

TPR = TP / (TP + FN) = 80 / (80 + 20) = 0.8

So, your TPR is 0.8, or 80%. That means your model correctly catches 80% of the actual positives. Expressing TPR as a percentage makes it easier to grasp how well your model is doing at spotting positive cases.

But remember, TPR is just one part of the story. You should also look at metrics like the false positive rate (FPR) to get a full picture of your model's effectiveness. By optimizing TPR (without letting FPR get out of hand), you make sure your model reliably finds the positives—super important in fields like medical diagnostics or fraud detection.

Best practices for maximizing true positive rate in experiments

So, how can you make sure you're getting the most out of TPR in your experiments? First off, watch out for imbalanced datasets. TPR only looks at the actual positives, so if you have very few of them, your estimate gets noisy—and a high TPR can still hide a flood of false positives when negatives dominate. Where you can, make sure you have enough positive instances in your data, and pair TPR with metrics like precision so the numbers tell the whole story.

Another tip is to use variance reduction techniques like CUPED. CUPED uses data from before the experiment to cut down on variance and pre-exposure bias. This leads to tighter confidence intervals and lower p-values—good news for your TPR! Also, consider sequential testing; it helps control false positives, especially if you're peeking at your results before the experiment is officially over.
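To see what the core CUPED adjustment actually does, here's a minimal sketch assuming a simple one-covariate setup with simulated data (production implementations handle a lot more, but the idea is the same):

```python
import numpy as np

def cuped_adjust(y: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Adjust metric y using the pre-experiment covariate x.

    theta = cov(x, y) / var(x); the adjusted metric keeps the same mean
    but has lower variance whenever x is correlated with y.
    """
    theta = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
    return y - theta * (x - x.mean())

rng = np.random.default_rng(42)
pre = rng.normal(100, 10, size=1_000)            # pre-experiment values
post = 0.8 * pre + rng.normal(0, 5, size=1_000)  # correlated in-experiment metric

adjusted = cuped_adjust(post, pre)
print(f"variance before: {post.var():.1f}, after CUPED: {adjusted.var():.1f}")
```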

Techniques like stratified sampling and cross-validation are also your friends. Stratified sampling makes sure each class (positive and negative) is properly represented when you're splitting your data. Cross-validation lets you test your model's performance across different slices of data, helping you build a more robust model and get a more accurate TPR.
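Here's a sketch of stratified cross-validation with scikit-learn, measuring TPR per fold on synthetic, imbalanced data. Note that recall on the positive class is exactly the TPR:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import StratifiedKFold

# Synthetic data with an 80/20 class split, just for illustration
X, y = make_classification(n_samples=1_000, weights=[0.8, 0.2], random_state=0)

tprs = []
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in skf.split(X, y):
    model = LogisticRegression(max_iter=1_000).fit(X[train_idx], y[train_idx])
    tprs.append(recall_score(y[test_idx], model.predict(X[test_idx])))

print(f"TPR per fold: {np.round(tprs, 2)}, mean: {np.mean(tprs):.2f}")
```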

Also, don't just stop at the binary outcomes. Look at the distribution of treatment effects, because sometimes the story is in the details. Using methods like empirical Bayes can help you estimate effects more efficiently, giving you a clearer picture of how your model performs across different segments. Understanding these nuances helps you make better decisions and tweak your model to improve TPR.
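Empirical Bayes comes in several flavors; here's a sketch of one simple version (a normal-normal model with method-of-moments shrinkage, using hypothetical per-segment effect estimates):

```python
import numpy as np

def eb_shrink(effects: np.ndarray, ses: np.ndarray) -> np.ndarray:
    """Shrink noisy per-segment effect estimates toward the overall mean.

    Segments with larger standard errors get pulled harder toward the
    grand mean, which reduces the influence of noise.
    """
    grand_mean = np.average(effects, weights=1 / ses**2)
    # Estimated between-segment variance, floored at zero
    tau2 = max(np.var(effects, ddof=1) - np.mean(ses**2), 0.0)
    weights = tau2 / (tau2 + ses**2)  # 0 = full shrinkage, 1 = none
    return grand_mean + weights * (effects - grand_mean)

# Hypothetical per-segment lift estimates and their standard errors
effects = np.array([0.12, -0.03, 0.25, 0.05])
ses = np.array([0.04, 0.05, 0.15, 0.03])
print(np.round(eb_shrink(effects, ses), 3))
```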

Finally, remember that boosting TPR isn't just about tweaking the model. It's also about designing your experiments thoughtfully and picking the right metrics. Choose metrics that match your business goals and give you a full view of how your model is doing. By focusing on what matters and using smart techniques to calculate TPR, you can unlock valuable insights and make better decisions for your team or organization.

At Statsig, we offer tools and features that help you implement these best practices effortlessly. With our platform, you can utilize techniques like CUPED and sequential testing without getting bogged down in the technical details.

Closing thoughts

Understanding and optimizing the true positive rate is key to running effective experiments and building reliable models. By paying attention to TPR and balancing it with other important metrics like FPR and specificity, you can gain deeper insights and make better, data-driven decisions. Remember to utilize techniques like CUPED, sequential testing, and cross-validation to enhance your results.

If you're looking to dive deeper or need tools to help manage your experiments, check out Statsig's resources and platform—we've got you covered. Hope you found this useful!
