This lets you use your existing events and metrics with Statsig’s experimentation engine. We’re launching with the ability to sync data from Snowflake, BigQuery, Redshift, and Databricks, and we’re excited to add more sources as needed.
As a Statsig user, you’ll be able to run our powerful stats engine and console experience on top of your existing data, giving you experiment results, feature gate measurement, and diagnostics on the events and metrics your team already uses.
Statsig is a full-stack platform for product experimentation and observability. Alongside experimentation and feature gate tooling, our SDK provides logging so customers can track product performance without any additional tooling.
While our SDK provides this powerful suite of tools for logging and analyzing events and metrics, many companies already have a well-established data organization and rely on internal datasets to track and measure their products. Our users have expressed that recreating and validating complex or critical metrics can be tedious.
We have existing tools to import metrics and events from data warehouses, but these put the onus on customers to create datasets in specific formats and manage the scheduling of the imports. This process was manual, introduced many points of failure, and offered no easy way for customers to fix or backfill data once it had been imported into Statsig.
With these pain points in mind, we built the new approach around the following goals:
Quick and easy set-up: Once you have your connection details on hand, it takes less than 5 minutes to get started
Set it and forget it: We’ll take care of keeping import data in sync, and proactively look for and report any issues with your data in Statsig
Keep things consistent: We’ll treat your imported data just the same as SDK data, materializing into experiment results, creating tracking datasets, and eventually allowing you to explore it in tools like Events Explorer.
Here’s what you can do:
In your Statsig metrics page, you’ll be able to find the new “Ingestions” tab
Here, you can give us connection information for one of the supported data warehouses
You’ll give us a SQL snippet that provides a view of your base metric or event data. This can be as simple as a SELECT * from your existing table!
In the console, you’ll be able to map your existing fields into Statsig fields
Once that’s done you can preview the data we’ll pull, set an ingestion schedule, and optionally load some recent historical data to get started.
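As an illustration, the SQL snippet from the steps above can be a simple view over a table you already have. The table and column names below are hypothetical, and the target names are just examples of the kinds of fields you’d map to Statsig fields in the console — not a required schema:

```sql
-- Hypothetical example: expose purchase events for ingestion.
-- Source table and column names are illustrative; you map each output
-- column to the corresponding Statsig field in the console.
SELECT
  customer_uuid   AS user_id,     -- the unit you experiment on
  purchased_at    AS ts,          -- event timestamp
  'purchase'      AS event_name,  -- event identifier
  order_total_usd AS value        -- numeric value for metric aggregation
FROM analytics.orders
```

Because the snippet is just a view over your source of truth, upstream fixes to the underlying table flow through on the next scheduled pull.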
We’ll do the work to make sure your data is synced and reflects your source of truth. Some of this work includes:
Running a scheduled pull and processing your data on your chosen schedule
Re-syncing data everywhere in the console when we notice a change from what we previously loaded
Providing notifications and alerts if there are issues with your ingestion so you can quickly address any problems with the connection setup
(Fast follow) supporting self-service backfills, so you can fix broken source data or retroactively add metrics to your experiment results
Here at Statsig, we are on a mission to empower your experimentation culture by making data more accessible.
We’re really excited about this new phase in how you can use Statsig. There’s always more work to do, and we’re always happy to hear — and act on — your feedback as you grow with Statsig.
The docs are here. Give it a try on Statsig, and let us know what you think!