Statsig supports SDKs in a number of different languages and frameworks, from popular client-side ones like React, JavaScript, iOS, and Android, to server-side integrations with Node.js, Ruby, Python, and Java.
Each of our SDKs has certain unit tests specific to their implementation—but all of our SDKs follow one of two main patterns:
A single-user, client-side SDK gets all evaluations for a single user upfront, and then synchronously responds to gate/experiment checks.
A multi-user, server-side SDK gets all gate/experiment definitions upfront, and then responds to each gate/experiment check by evaluating it locally (see the sketch below).
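To make the two patterns concrete, here is a rough TypeScript sketch. The clientSdk and serverSdk objects, keys, and gate name are hypothetical stand-ins, not the exact Statsig API, which varies slightly per language.

```typescript
// Illustrative sketch of the two SDK patterns. The clientSdk and serverSdk
// objects, keys, and gate name are hypothetical stand-ins, not the exact
// Statsig API (which varies slightly per language).
declare const clientSdk: {
  initialize(clientKey: string, user: { userID: string }): Promise<void>;
  checkGate(gateName: string): boolean;
};
declare const serverSdk: {
  initialize(serverSecret: string): Promise<void>;
  checkGate(user: { userID: string }, gateName: string): boolean;
};

async function demo(): Promise<void> {
  // Client-side pattern: fetch this one user's evaluations up front...
  await clientSdk.initialize("client-key", { userID: "user-123" });
  // ...then every check is a synchronous, local lookup.
  const onClient = clientSdk.checkGate("new_checkout_flow");

  // Server-side pattern: fetch all gate/experiment definitions up front...
  await serverSdk.initialize("server-secret");
  // ...then evaluate any user against those definitions locally.
  const onServer = serverSdk.checkGate({ userID: "user-123" }, "new_checkout_flow");
}
```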
Because they follow similar patterns, we knew we could write a single test case that would run seamlessly across each of them. Our first iteration of this was our server SDK consistency test.
Given a test project in Statsig, the consistency test will evaluate all of its gates and experiments for a set of users, and ensure that the SDK behaves the same as the Statsig backend would. This approach gave us a solid foundation for SDK consistency across server SDKs, but required us to rewrite this test case in each language.
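In rough TypeScript/Jest terms, a consistency test looks something like the sketch below. The sdk object, fetchExpectedEvaluations helper, and TEST_PROJECT_KEY are illustrative placeholders; the real test pulls its expected results from a dedicated Statsig test project.

```typescript
// Sketch of a server SDK consistency test: evaluate every gate/experiment in a
// test project for a set of users and compare against the backend's expected
// results. The sdk object, fetchExpectedEvaluations, and TEST_PROJECT_KEY are
// illustrative placeholders.
interface ExpectedResult {
  user: Record<string, unknown>;      // the Statsig user to evaluate
  gates: Record<string, boolean>;     // gate name -> expected pass/fail
  configs: Record<string, unknown>;   // experiment/config name -> expected values
}

declare const TEST_PROJECT_KEY: string;
declare function fetchExpectedEvaluations(key: string): Promise<ExpectedResult[]>;
declare const sdk: {
  initialize(key: string): Promise<void>;
  checkGate(user: Record<string, unknown>, gateName: string): boolean;
  getConfig(user: Record<string, unknown>, configName: string): { value: unknown };
};

test("SDK evaluations match the Statsig backend", async () => {
  const expected = await fetchExpectedEvaluations(TEST_PROJECT_KEY);
  await sdk.initialize(TEST_PROJECT_KEY);

  for (const entry of expected) {
    for (const [gate, want] of Object.entries(entry.gates)) {
      expect(sdk.checkGate(entry.user, gate)).toBe(want);
    }
    for (const [config, want] of Object.entries(entry.configs)) {
      expect(sdk.getConfig(entry.user, config).value).toEqual(want);
    }
  }
});
```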
Kong is our TypeScript-based, write-once-run-on-every-SDK test framework. “Write once, run anywhere” has always been a dream for programmers; that's why we have frameworks like React Native and Flutter (both of which we offer SDKs for 🙂). So, naturally, Daniel Loomb took this on.
First, he wrote a webserver for each of the languages we offer an SDK in, and integrated the Statsig SDK. He then exposed a standard set of endpoints that each webserver needs to be able to respond to, corresponding to the standard SDK methods: initialize, checkGate, getExperiment, and logEvent.
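As a rough illustration, here's what one of those bridge webservers might look like in Node with Express. The statsig object stands in for that language's Statsig SDK, and the route paths and payload shapes are assumptions for the sketch, not Kong's actual contract.

```typescript
// Rough sketch of one language's bridge webserver (Node + Express shown here).
// The statsig object stands in for that language's Statsig SDK, and the route
// paths and payload shapes are assumptions, not Kong's actual contract.
import express from "express";

declare const statsig: {
  initialize(secretKey: string, options?: Record<string, unknown>): Promise<void>;
  checkGate(user: Record<string, unknown>, gateName: string): boolean;
  getExperiment(user: Record<string, unknown>, experimentName: string): unknown;
  logEvent(user: Record<string, unknown>, eventName: string, value?: unknown): void;
};

const app = express();
app.use(express.json());

// Each endpoint maps 1:1 to a standard SDK method so that a single test suite
// can drive every language's SDK through the same HTTP contract.
app.post("/initialize", async (req, res) => {
  await statsig.initialize(req.body.secretKey, req.body.options);
  res.json({ ok: true });
});

app.post("/check_gate", (req, res) => {
  res.json({ value: statsig.checkGate(req.body.user, req.body.gateName) });
});

app.post("/get_experiment", (req, res) => {
  res.json({ value: statsig.getExperiment(req.body.user, req.body.experimentName) });
});

app.post("/log_event", (req, res) => {
  statsig.logEvent(req.body.user, req.body.eventName, req.body.value);
  res.json({ ok: true });
});

app.listen(4000);
```

The point of the shared contract is that the test suite never needs to know which language it is talking to.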
Now we can write a single test in TypeScript, using the Jest framework, that composes calls to those endpoints to verify behavior across arbitrary test cases: initialize the SDK with a certain set of options, log 10,000 events, check a gate or two, and verify that all the event and exposure logs are correct for the given inputs, across all SDKs.
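A Kong-style test case might then look something like this sketch. The SDK_SERVERS list, endpoint paths, expected gate value, and the /flush_logs verification step are assumptions for illustration, and a global fetch (Node 18+) is assumed.

```typescript
// Sketch of a Kong-style Jest test driving every SDK bridge through the same
// HTTP contract. SDK_SERVERS, the endpoint paths, the expected gate value, and
// the /flush_logs verification step are assumptions for illustration; a global
// fetch (Node 18+) is assumed.
const SDK_SERVERS = ["http://localhost:4001", "http://localhost:4002"]; // one bridge per SDK language

async function post(base: string, path: string, body: unknown) {
  const res = await fetch(`${base}${path}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  return res.json();
}

test.each(SDK_SERVERS)("events and exposures are consistent (%s)", async (server) => {
  await post(server, "/initialize", { secretKey: process.env.TEST_SECRET, options: {} });

  const user = { userID: "kong-user-1" };
  for (let i = 0; i < 10_000; i++) {
    await post(server, "/log_event", { user, eventName: "test_event", value: i });
  }

  const gate = await post(server, "/check_gate", { user, gateName: "kong_test_gate" });
  expect(gate.value).toBe(true); // expected value is defined by the test project

  // Finally, verify the SDK produced the right event and exposure logs for these inputs.
  const logs = await post(server, "/flush_logs", {});
  expect(logs.events).toHaveLength(10_001); // 10,000 custom events + 1 gate exposure
}, 120_000); // generous timeout: 10,000 HTTP round trips per SDK
```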
We run these tests when we make server changes that impact our API to ensure those changes won’t break our SDKs. We also run them when we make changes to each SDK, and before we publish a new version of each SDK.
We’re not done yet! Our next task is to extend these tests to long-running webservers. Currently, each Kong test case spins up each webserver, makes a set of requests, and exits; as we continue to update our SDKs, we’d like to verify their long-term performance and make sure there are no memory leaks or race conditions associated with long-running sessions.
We’d also like to incorporate some more complex cases: spin up a server, check gates/experiments, use the console API to change some gates/experiments or create new ones, verify that we still evaluate all of them correctly and log the correct exposures across each SDK, and then do it all again an hour later, all while shadowing production traffic to each of these webservers.