Date of Slack thread: 7/30/24
Anonymous: We’re running an experiment to redesign our homepage and see the effects on a few key metrics. This would be a Stable ID experiment. The issue is that both new users and existing users see the homepage. We only want to consider new users from the start of the experiment for the metric results. But we only know after the fact if a user is an existing user (once they sign in). Is there some solution that allows us to exclude these existing users from the experiment metrics when the information about whether they are an existing user is unknowable when the exposure is generated?
Lin Jia (Statsig): Hi <@U04NR2KD83Z>, the likely solution here is to use an override (https://docs.statsig.com/experiments-plus/overrides) for this experiment to exclude the set of users.
Lin Jia (Statsig): Note that if you add an override after a user has already been exposed in an experiment, that user will not be excluded from Pulse results: their events and metrics will continue to be attributed to the group they were first assigned to. The override is only applied and honored from the moment you define it. This can dilute or pollute your experiment results, since such a user's actions may be attributed to one group while they were actually exposed to a different experience. The overall impact depends on how many users fall into this category.
Makris (Statsig): I don’t think Overrides would work given Mark doesn’t know which users to exclude until after they are logged in, which happens after assignment.
Makris (Statsig): The Statbot suggestion above is for WHN (Warehouse Native) only. Mark, you're on Cloud, correct?
Anonymous: Gotcha, yes I’m on Cloud.
Makris (Statsig): Hey Mark, so what you want is a Segment of known users. If you set up a Segment of these known existing users, you can then exclude these known users from your upcoming experiment using Conditional Overrides.
Anonymous: Thanks Makris. To populate a Segment, it looks like we use a Conditional or an ID list. But for Conditional, we don’t have a signal for whether a user is an existing user before their first exposure, and likewise, we wouldn’t have knowledge of whether the user has a User ID before the first exposure. Might I be missing something? (Never worked with Segments before)
Makris (Statsig): Segments are independent of this experiment; they are a company-wide concept. You'd use some existing event, e.g. account-login (I'm assuming you have one), to create a Segment. You'd define the Segment to be the set of StableIDs that have ever fired an account-login event.
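[Editor's note: a minimal offline sketch of the idea above — collecting the StableIDs that have ever fired the login event. The CSV format and the `stable_id`/`event_name` column names are assumptions for illustration, not a Statsig export schema.]

```python
import csv

def existing_user_stable_ids(event_export_path, login_event="account-login"):
    """Collect the set of StableIDs that have ever fired the login event.

    Assumes a CSV event export with 'stable_id' and 'event_name' columns;
    both the column names and the file format are hypothetical.
    """
    ids = set()
    with open(event_export_path, newline="") as f:
        for row in csv.DictReader(f):
            if row["event_name"] == login_event:
                ids.add(row["stable_id"])
    return ids
```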
Makris (Statsig): I see you created a segment a few minutes ago to test out. I’m looking at this now too.
Makris (Statsig): I don’t actually see the logic I thought was there. Taking a closer look.
Makris (Statsig): I just spoke to another member of the team and currently there’s no way to do exactly your desired goals in cloud.
Makris (Statsig): The only way it would be doable is with a fixed override/segment that you'd have to generate manually from a list of StableIDs. However, we only support up to about 1,000 of these StableIDs in an override. I'm guessing that limit is far too low for your number of existing users.
Makris (Statsig): My suggestion would be to go ahead with your experiment and include both new and existing users.
Makris (Statsig): Absolute metric values for the control/treatment groups could be watered down (e.g. account-creation); fortunately, though, any relative changes will remain intact, so you can still trust the reported change and confidence intervals as accurate.
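[Editor's note: a toy calculation of the dilution point, with all rates invented for illustration. It assumes the treatment effect is proportional across both populations; under that assumption, mixing in existing users lowers the absolute rates but leaves the relative lift unchanged.]

```python
# Hypothetical conversion rates; assumes a uniform 5% relative lift
# applies to both new and existing users.
new_ctrl, new_treat = 0.10, 0.105        # new users: 5% relative lift
old_ctrl, old_treat = 0.50, 0.525        # existing users: same 5% lift

new_share = 0.20                         # assume 20% of traffic is new users

# Pooled (diluted) absolute rates across the mixed population
pooled_ctrl = new_share * new_ctrl + (1 - new_share) * old_ctrl
pooled_treat = new_share * new_treat + (1 - new_share) * old_treat

# Absolute rates are dominated by existing users (~0.42 vs ~0.10),
# yet the relative lift is still 5%.
relative_lift = pooled_treat / pooled_ctrl - 1
```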
Anonymous: <@U0727PC0VM0> I really appreciate you going the extra mile to look into this question. Gotcha, makes sense. Thank you for your help!
Timothy Chan (Statsig): Hi Mark, there is a way but it’s a bit buried. You can upload a list of user_ids into a segment, and use this list to filter out (exclude) or filter in (include) users during a custom query. You can find this setting in Pulse here:
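[Editor's note: if the known user_ids live in your own database, a small sketch for producing the upload file. The one-ID-per-line format is an assumption about what the segment upload accepts, not a documented Statsig format.]

```python
def write_id_list(user_ids, path):
    """Write user_ids one per line for upload into a segment.

    The one-ID-per-line text format is a hypothetical assumption;
    check the accepted upload format in the Statsig console.
    """
    with open(path, "w") as f:
        for uid in sorted(user_ids):
            f.write(f"{uid}\n")
```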
Timothy Chan (Statsig): It’s a bit indirect. We are planning on making the ability to upload user attributes for analysis more accessible in the next 3 months.