These questions range from the genuinely problematic (a stakeholder asking if you can drill down to find a “win” in their experiment results) to great questions that just require some mental bandwidth to answer well and pull you away from the work you were planning to do.
In my two years at Statsig, I’ve totally changed direction on this. Questions are—besides my excellent teammates—probably the most valuable currency we have here.
Unexpectedly, as I’ve moved into a significantly more technical role than in previous jobs, I’ve gotten more and more value from engaging with stakeholder questions, teaching, and often being proven wrong.
The stakeholders I’m usually working with at Statsig don’t work at Statsig.
As we’ve continuously scaled our product and customer base, we’ve doggedly held onto our ethos of deeply engaging with customer questions and providing the best support we possibly can. This means that everyone pitches in—from our Sales Engineers and Sales Reps to our back-end Infrastructure Engineers and Data Scientists.
This is obviously expensive, but even as we’ve scaled and have had to build systems to properly support our current customer base, we’ve continued to believe that this is worth it.
Most of the questions we get at Statsig come from data teams or a company’s dedicated experimentation team. These teams have their own stakeholders, and are responsible for maintaining their company’s experimentation infrastructure on our platform.
Because of that, questions (and learnings) go both ways. Instead of acting as an authority on “how to experiment,” our approach has transitioned to an opinionated curiosity, with a huge willingness to learn. People reciprocate, and so support questions become opportunities for us to educate each other rather than a tax on our time.
At multiple points in Statsig’s development, customers have come to us with use cases that we didn’t support at the time and had valid reasons to push back on. In the end, we built a compromise solution that solved their problem, because we were able to engage with them and understand the “why” behind the ask.
In some cases, we’ve pushed back—we have a responsibility as a data platform to not encourage bad practices—but in many cases, we’ve realized there’s a compromise that worked for that partnership and future partnerships.
This, funnily enough, leads to a reinforcing loop; as we demonstrate a willingness to collaborate, people have more product asks. This is a good thing.
Much like running experiments, working with customers gives us a tight feedback loop as we continue to develop Statsig. This happens through multiple avenues; the most obvious is getting direct product feedback on what works and what doesn’t, but we’ve also found that:
Hopping on a call and having a customer walk us through their workflow highlights needs or missing functionality that we wouldn’t have discovered in our own “use-as-designed” workflows
Working with highly educated, experienced, and professional partners means we learn a lot. I think we know quite a bit about experimentation here, but I don’t believe I know more than everybody. In multiple instances, we’ve been able to ship improvements to our product for all of our customers, just because one customer challenged us
Having a tight partnership means that bug reports or feature requests are well-structured—to the point where people will prioritize and do mocks for asks and tell us that half of them are just “nice-to-haves.” This means that the increase in product asks above is dramatically more efficient
The partnership approach also allows us to operate with trust. With established partners, I frequently do product mocks by editing HTML in Chrome’s inspector to share what I’m thinking of building. The first few times, this felt a bit silly, or even unprofessional, but it also meant that we could ship smaller features the same day they were requested. It’s a standard MO for me now.
This is a significant tax on our time. As a data scientist and developer, I spent hours each week on customer calls and regularly flew as far as Australia or Germany to do working sessions with customers.
In addition to the ongoing dialogue we try to foster, we’ve learned some permanent lessons from operating in this mode that I think steer how we operate—hopefully in a way people get value from!
One painful lesson for me has been that misunderstandings are functionally bugs for customers—whether they see it that way or not. If something “looks wrong,” it usually means there’s something wrong with how we’re rendering it or we’re missing a critical warning or explanation.
When I see a customer question more than once, this is a near-instant trigger for me to update documentation and reach out to the design team.
At Statsig we love doing early releases of prototype features to get them out the door (we like to use the term “eng design”), but critically we try to use this as a forum to get feedback from that early, opt-in set of users on our platform and to make sure they understand what we’re doing.
This has been especially important as part of the Statsig Warehouse Native team’s process since we’re tightly integrating with customers’ data layer!
Open dialogue around timelines, blockers, and misgivings about our features leads to customers helping us unblock them. It’s magical to see a Slack channel transform from reports to opinionated asks as customers respond to this approach.
We’re still figuring this out. I’ve definitely gotten burned by overpromising and having to burn some midnight oil delivering on that promise, but it’s infinitely preferable to the alternative!
Something I’ve observed about the experimentation space is there’s a lot of dogma, appeals to authority, and blind statements of superiority. Because there is so much that can go wrong in experimentation, people are (understandably) conservative, and many vendors in the space position themselves as ultimate authorities in order to appeal to that desire to not make mistakes.
In many cases, this leads to false negatives. In multiple partnerships, we’ve had customers who were evaluating us and other vendors—particularly in the warehouse native space—tell us that someone told them they were incorrect for wanting a feature that was actually statistically and analytically valid.
Some of our funnel functionality and ID stitching falls into this bucket. These features are easy to mess up, but from my perspective that’s a huge part of the value of a testing platform: we can help steer users away from mistakes that would lead to poor results and bad decisions.
Embracing an attitude of humility is something the data team at Statsig treats as critically important. Even the best teams can make the wrong call, and being open to dialogue around solving people’s problems has been a huge part of us being a comprehensive and robust platform for experimentation. While we take our responsibility around governing a data platform seriously, and have strong opinions in our design, we’re always willing to listen.