Jina Yoon

Technical Content Writer, Statsig

ENGINEERING

What no one tells you about feature flags and messy code

Fri Mar 21 2025

Feature flags aren't ruining your codebase. Lack of cleanup accountability is.

Feature flags are the secret sauce behind the rapid releases of major tech companies like Amazon, Meta, OpenAI, Notion, and many others. But they can also bog down your codebase in hundreds of if/else blocks if you're not careful. This post covers the part that most “feature flag best practices” guides avoid talking about: how to think about cleanup and deprecation.

If you’ve ever tried coding with feature flags, there’s a good chance this thought has crossed your mind at least once:

“I’m really just putting a bunch of if statements here and there. Is there a better way to do this? Am I doing this right? Are these basically just OP global variables? This feels messy.”

And you’re not wrong. Like most tools, feature flags don’t cause tech debt on their own: it’s how you use them that determines whether they speed you up or weigh you down.

TL;DR—Feature flags are not the problem, but they can lead to messy code if you don’t use them responsibly. Here’s how to avoid that.

What usually happens

You’re on a software team. There’s a new feature or infrastructure change, and you want to ship it safely. So, you put it behind a feature flag. You test it, ramp it up, and eventually release it to 100% of users.

Everyone’s happy. Job done, right?

Well… kind of. That old flag is still sitting in your codebase, hidden in an if block that no longer serves a purpose. The next day, you’re already assigned a new task. No one’s asking about the flag anymore. You still think about it sometimes—maybe while brushing your teeth—but it’s no one’s top priority. And so it stays.

Why it's a problem

As you add more and more feature flags, the actual path that your code runs can expand into a million different directions, much like the Marvel universe's sacred timeline exploding into itself.

These tiny incremental forks add up: they introduce new bugs, slow down dev workflows, and erode overall engineering quality of life. How painful is it really, you ask? Let's walk through an example.

Say your product team runs a promo that changes how discounts are calculated. The current logic looks like this:

    
if (statsig.checkGate("ff_new_checkout")) {
  // New discount logic with dynamic pricing
  showNewDiscountFlow(user); 
} else {
  // Old coupon-based logic
  applyCouponIfEligible(user.cart);
}

Now imagine you're asked to tweak the discount rules. Do you update just the new logic? Do you also update the old logic? At this point, unless you have a central place to search for flag definitions and statuses, you'll have to dig through both functions to decide. Each of those might also have its own convoluted flag logic inside, and the timelines continue to split in all directions.
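To put a rough number on that branching, here is a minimal sketch that lists every on/off combination for a set of independent boolean flags. The helper `flagCombinations` and the two extra flag names are made up for illustration. Treat the count as an upper bound, since nested flags are not always reachable in every combination.

```javascript
// Hypothetical helper: list every on/off combination for a set of
// independent boolean flags. Each combination is one path a reviewer or
// tester may have to reason about. Intended for small lists of flags.
function flagCombinations(flagNames) {
  const combos = [];
  const total = 2 ** flagNames.length; // n flags -> 2^n combinations
  for (let mask = 0; mask < total; mask++) {
    const combo = {};
    flagNames.forEach((name, i) => {
      // Bit i of the mask decides whether flag i is on in this combination.
      combo[name] = (mask & (1 << i)) !== 0;
    });
    combos.push(combo);
  }
  return combos;
}

// ff_new_checkout comes from the example above; the other two names are made up.
const combos = flagCombinations([
  "ff_new_checkout",
  "ff_dynamic_pricing",
  "ff_promo_banner",
]);
console.log(combos.length); // 8
```

Three flags already mean up to eight distinct paths, and every stale flag you leave behind doubles that number again.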

Why this happens

It's easy to leave flags behind because there's rarely a forcing function to clean them up. They don’t break anything. Your product still works. But over time, the accumulation of stale flags makes your code harder to read, reason about, and refactor. Especially in sensitive or high-traffic areas of your code, this clutter adds up.

So what can you do?

Best practices for feature flag cleanup

1. Create a cleanup ticket alongside your rollout

When you introduce a feature flag (say, ff_new_checkout), one approach is to ship it with a follow-up task or ticket to remove it. The best time to do this is when the context is fresh: you know why the flag exists, what it's gating, and when it can likely be removed.

For example, if the flag is being used to slowly roll out a new checkout experience, and you're aiming for 100% rollout by the end of the month, create a “Remove ff_new_checkout” ticket with a due date 30–45 days after full rollout. You can include the location of the flag in the codebase, a link to the original rollout PR, and any conditions that need to be met before removal (e.g. “Only remove once adoption is at 100% and no rollbacks have occurred in 14 days”).

The PR description for the new feature might include something like this:

Feature Rollout: New Checkout Flow

This PR introduces the `ff_new_checkout` flag to gate a new checkout experience.
✅ Created a follow-up cleanup ticket: [ENG-1234](https://jira.example.com/browse/ENG-1234)
🎯 Target cleanup date: ~30 days post full rollout
📌 Will remove old `renderLegacyCheckout()` path once usage hits 100% with no issues for 2 weeks

And you might add a new ticket with a template like this for the flag cleanup:

Ticket: Remove `ff_new_checkout` feature flag

Context: This flag was added in [PR #456](https://github.com/org/repo/pull/456) to safely roll out the new checkout flow. The rollout reached 100% on March 12, 2025.
Tasks:
- [ ] Remove all `if/else` conditions using `ff_new_checkout`
- [ ] Delete the flag from Statsig’s dashboard (mark as deprecated first)
- [ ] Remove related experiment code or tracking if applicable
- [ ] Update documentation or `FLAGS.md` if needed
Blocked by:
- [ ] Confirmation that no users are on the legacy flow
- [ ] No recent rollbacks in the past 14 days

This workflow solves one of the most common problems with feature flags: they disappear from your team's mental backlog almost immediately after launch. If you wait until someone remembers that a flag exists—or worse, encounters it while trying to debug unrelated behavior—it's already adding to your tech debt.

It also helps assign clear ownership and accountability. If your team uses Jira, Asana, or any ticketing tool, you can assign the removal task to the same person or team who owned the rollout. That way, cleanup is part of the same dev cycle.
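When you fill in the "location of the flag" part of a cleanup ticket, a small search script can save some digging. The sketch below is one possible approach, not a Statsig feature. The helper `findFlagReferences` and the file contents are hypothetical. It works on an in-memory map of file path to source text, so reading files from disk is left out. It also uses a plain substring match, which will miss flag names that are built dynamically or stored in constants.

```javascript
// Hypothetical helper: given a map of file path -> source text, list every
// line that mentions a flag name.
function findFlagReferences(sources, flagName) {
  const hits = [];
  for (const [path, text] of Object.entries(sources)) {
    text.split("\n").forEach((line, index) => {
      if (line.includes(flagName)) {
        // Line numbers are 1-based so they match what an editor shows.
        hits.push({ path, line: index + 1, text: line.trim() });
      }
    });
  }
  return hits;
}

// Made-up file contents for illustration.
const sources = {
  "checkout/discounts.js": [
    'if (statsig.checkGate("ff_new_checkout")) {',
    "  showNewDiscountFlow(user);",
    "} else {",
    "  applyCouponIfEligible(user.cart);",
    "}",
  ].join("\n"),
  "checkout/summary.js": "renderSummary(cart);",
};

// Logs a single hit: checkout/discounts.js, line 1.
console.log(findFlagReferences(sources, "ff_new_checkout"));
```

Pasting the resulting list into the ticket gives whoever picks it up a concrete starting point, even if they were not around for the original rollout.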

Of course, there are real-world challenges: cleanup tasks are often seen as lower priority than new work, and tickets created “for later” sometimes languish in the backlog. That’s why it helps to make the cleanup ticket visible and linked—either in the rollout PR description, in your code comments, or in a shared feature tracking doc. Some teams even use label systems like flag:introduced and flag:remove-by to track this systematically.

Platforms like Statsig take this even further by tying flags to the users who created them, auto-generating staleness reminders, and surfacing inactive flags in dashboards—making it much harder for these cleanup steps to fall through the cracks.

2. Have a shared framework for what deserves cleanup

That said, adding these actions and checks also takes time. In practice, not every flag needs to be aggressively retired. Sometimes it’s okay to leave a flag in place forever, especially if it gates a critical kill switch or emergency path. But for most feature rollouts, you should probably know ahead of time:

When should this flag be removed?
Who’s responsible for removing it?
What happens if we forget?

Depending on your company's culture and priorities, the answers could totally be that the flag will never be removed, and that no one is responsible for removing it. The point is that this is a choice; if you aren't happy with it, maybe there's a greater organizational or leadership question that needs to be asked here... 🤔

Generally speaking, though, teams might dedicate the extra time or resources for flag cleanup around highly sensitive or critical features such as those related to user privacy or security. Other smaller nits and fixes—maybe not. It really depends on your culture.

Ultimately, the goal behind asking these questions isn’t to make your codebase perfectly clean. It’s to be intentional about what deserves cleanup time.

3. Set automated reminders for stale flags

Manual reminders (like setting a calendar event or Jira due date) work… until they don’t. You could improve this by adding automation:

Fail CI if a flag is older than X months
Run a Slack bot that posts stale flag warnings weekly
Send reminder emails based on flag creation dates

However, this only works if you track metadata like who created the flag, when it was created, and when it should be removed. With Statsig, this tracking is built in. When you create a flag, it auto-assigns an owner (based on the creator), and sends reminder emails when the flag goes stale, all without you needing to wire anything up.
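If you are wiring this up yourself, the "fail CI if a flag is older than X months" idea could look roughly like the sketch below. It assumes you already keep a registry with a creation date, an owner, and a status for each flag. The helper `findStaleFlags`, the field names, the team names, and every flag other than ff_new_checkout are hypothetical.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: return flags older than maxAgeDays, skipping flags
// that are intentionally permanent (for example, kill switches).
function findStaleFlags(flags, now, maxAgeDays) {
  return flags.filter((flag) => {
    if (flag.status === "permanent") return false;
    // ISO dates like "2025-01-10" are parsed as midnight UTC.
    const ageDays = (now.getTime() - Date.parse(flag.createdOn)) / MS_PER_DAY;
    return ageDays > maxAgeDays;
  });
}

// Made-up registry entries; only ff_new_checkout appears in this post.
const flags = [
  { name: "ff_new_checkout", owner: "checkout-team", createdOn: "2025-01-10", status: "active" },
  { name: "ff_payments_kill_switch", owner: "payments-team", createdOn: "2023-06-01", status: "permanent" },
  { name: "ff_promo_banner", owner: "growth-team", createdOn: "2025-03-01", status: "active" },
];

const stale = findStaleFlags(flags, new Date("2025-03-21T00:00:00Z"), 60);
for (const flag of stale) {
  console.log(`Stale flag: ${flag.name} (owner: ${flag.owner})`);
}
// In CI, a non-zero exit code fails the job:
// if (stale.length > 0) process.exit(1);
```

With a 60-day limit, only ff_new_checkout is reported here: it is 70 days old, the kill switch is marked permanent, and the promo banner flag is 20 days old. The same list could feed a weekly Slack message or a reminder email instead of a failing build.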

Another point to consider is whether you might need accountability measures for this. Email reminders help people know that there is more work to be done, but they cannot guarantee that someone will actually put in the time to do it. That part's still up to your organizational culture and priorities.

4. Maintain a living flag dashboard

If you're not using a feature management platform, a simple FLAGS.md file in the root of your repo can go a long way. This acts like a package.json for your flags: a single source of truth for what flags exist, who owns them, and when they should be removed. At a minimum, include columns like:

Flag name – the exact string used in code (ff_new_checkout)
Created on – when the flag was introduced (helps track staleness)
Deprecate by – a target date for cleanup
Owner – the responsible engineer or team
Status – one of active, to-remove, permanent, or deprecated

You can keep this as a markdown table, spreadsheet, or YAML config—whatever makes sense for your team. The key is to make it visible and reviewable during code reviews, retros, or monthly audits. It introduces a lightweight process for flag hygiene without needing full tooling, and it makes stale flags easier to catch before they accumulate.
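As an illustration of the markdown table option, here is a small sketch that renders a flag registry into the table you might paste into FLAGS.md. The columns and status values follow the list above. The helper `renderFlagsTable`, the field names, and the example rows are hypothetical.

```javascript
// Column titles follow the list above, paired with made-up field names.
const COLUMNS = [
  ["Flag name", "name"],
  ["Created on", "createdOn"],
  ["Deprecate by", "deprecateBy"],
  ["Owner", "owner"],
  ["Status", "status"],
];
const STATUSES = ["active", "to-remove", "permanent", "deprecated"];

// Hypothetical helper: render a flag registry as a markdown table for FLAGS.md.
function renderFlagsTable(flags) {
  const header = `| ${COLUMNS.map(([title]) => title).join(" | ")} |`;
  const divider = `| ${COLUMNS.map(() => "---").join(" | ")} |`;
  const rows = flags.map((flag) => {
    // Reject typos in the status column before they reach the file.
    if (!STATUSES.includes(flag.status)) {
      throw new Error(`Unknown status "${flag.status}" for ${flag.name}`);
    }
    // Permanent flags may have no removal date, so show a dash for missing values.
    const cells = COLUMNS.map(([, key]) => flag[key] ?? "-");
    return `| ${cells.join(" | ")} |`;
  });
  return [header, divider, ...rows].join("\n");
}

// Made-up rows; only ff_new_checkout appears in this post.
console.log(
  renderFlagsTable([
    { name: "ff_new_checkout", createdOn: "2025-02-01", deprecateBy: "2025-04-15", owner: "checkout-team", status: "to-remove" },
    { name: "ff_payments_kill_switch", createdOn: "2024-06-10", owner: "payments-team", status: "permanent" },
  ])
);
```

Generating the table from structured data is optional. Its main benefit is that an unknown status fails loudly instead of slipping into the file unnoticed.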

That said, this is a very manual approach. Someone will still have to update this file, remember to check it, and enforce any lifecycle rules. This works fine for small teams or a handful of flags, but once you’re managing dozens across services, things will slip.

That’s where platforms like Statsig remove the burden. Instead of maintaining metadata manually, flags are automatically tracked in a centralized UI that teammates across functions and teams can easily use. You get built-in visibility into usage, stale flags are highlighted, and you can filter by owner or cleanup status—no extra process required.

Final thoughts

Feature flags are an incredibly powerful tool but, as the cliché goes, with great power always comes great responsibility. Implementing them with the end in mind is important, or else you're just trading short-term speed for long-term mess.

You don’t need a perfect system. But you do need a system—one that includes ownership, visibility, and a clear lifecycle for flags. Whether you do that with internal tooling or a platform like Statsig, treating flags like part of your infrastructure (not just your code) will save you a lot of headaches down the line.


Permalink: https://www.statsig.com/blog/feature-flag-code-cleanup
