About the guest author. Alexey Komissarouk is a Growth Engineering Advisor who teaches Growth Engineering on Reforge. Previously, he was the Head of Growth Engineering at MasterClass.
In March 2025, the agentic assistant Manus.im streaked across X and Hacker News. The virality generated massive early interest, and the founders were overjoyed, until they realized their GPU cluster was crumpling under the crush of sign‑ups. The company froze invites, and the buzz around Manus.im evaporated as fast as it came.
Two years earlier, ChatGPT had faced an even bigger wave—hitting 100 million monthly users in barely two months—yet held response times under a second thanks to months of staged load tests and pre‑purchased GPU blocks.
This contrast captures the core of AI‑era growth: some startups ride the wave to hyper-growth, while others sink under the swell. Not since the days of Friendster has a winning product been lost to an inability to support growing traffic. When a company’s ability to scale is tied to LLM compute, growth best practices borrowed from conventional SaaS need to be re-imagined. Over the last few years, our work has brought us into close contact with fast-growing AI companies. Below is a synthesis of our observations so far: four growth‑critical challenges unique to AI products, and how to navigate them.
When Toronto startup Wombo released its lip‑sync app in February 2021, it “took off like a rocket ship,” logging 25 million downloads in the first month and 74 million within ten months. The spike pushed its AWS bill “an order of magnitude higher than all the money [it] had raised,” threatening bankruptcy until AWS stepped in with credits and architectural help.
You only get one shot. If you can’t absorb fast-growing demand, you forfeit network effects, fresh data to fine-tune your models, and priceless social proof.
Don’t expect Amazon to bail you out the minute you’re successful; instead, invest in high-ROI scale prep. Specifically:
Pre‑negotiate burst capacity with vendors (spot GPUs, secondary clouds, reseller credits) so you scale up within minutes without surprise bills.
What happens if you get more traffic than you can handle? Think ahead and ship a service degradation plan: shorten context windows, compress outputs, or enter queue mode rather than just erroring out.
Once you’ve got a working product, consider running quarterly “Black Friday” drills—replay 10× expected traffic in a load-test environment to identify weak spots for scale, including auth and billing.
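The degradation plan above can be sketched as a load-aware tier selector. This is a minimal illustration, not a production policy: the utilization thresholds, tier names, and token limits are all made-up assumptions you would tune against your own cluster.

```python
# Hypothetical service-degradation ladder: pick a serving tier based on
# current GPU utilization instead of erroring out under load.
# Thresholds and limits below are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class DegradationTier:
    name: str
    max_context_tokens: int   # shorten context windows under load
    max_output_tokens: int    # compress outputs under load
    queue_requests: bool      # enter queue mode instead of failing

# (utilization ceiling, tier) pairs, checked in order.
TIERS = [
    (0.70, DegradationTier("normal",   8192, 1024, False)),
    (0.85, DegradationTier("degraded", 4096,  512, False)),
    (1.00, DegradationTier("queue",    2048,  256, True)),
]

def pick_tier(gpu_utilization: float) -> DegradationTier:
    """Map cluster load (0.0-1.0) to a serving tier."""
    for ceiling, tier in TIERS:
        if gpu_utilization <= ceiling:
            return tier
    return TIERS[-1][1]  # saturated: queue everything
```

The point is that the fallback behavior is decided ahead of time, in code, rather than improvised during the traffic spike.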
In March 2023, Midjourney halted its wildly popular free tier, citing “extraordinary demand and trial abuse,” and even pushed paying users into wait queues. The abrupt pause triggered Discord uproar and a flood of cancellations, forcing the company to reintroduce throttled access weeks later.
In traditional SaaS, compute costs become increasingly insignificant as usage grows. With LLMs, however, inference costs scale directly with engagement, meaning your own growth can nuke gross margin. Surprise price hikes or throttles fracture trust and mute referral loops just when you need them most.
Consider whether your product’s usage could be metered in tokens or GPU‑seconds rather than calendar months. If flat-fee monthly billing is still the best choice, consider explicit monthly or daily rate limits on lower-end paid plans.
Offer a “happy hour” off‑peak tier that absorbs hobby traffic without harming premium latency.
Wire in a unit‑economics circuit‑breaker: if gross margin per user dips below a guardrail, automatically turn off paid marketing and gate new sign‑ups until you scale supply or lift prices.
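The circuit-breaker above amounts to a small margin check run on a schedule. A minimal sketch, assuming per-user revenue and inference cost are already tracked; the function name and the 40% guardrail are hypothetical examples, not recommendations.

```python
# Illustrative unit-economics circuit-breaker. The 0.40 guardrail and
# all names here are assumptions for the sake of the sketch.
def margin_guardrail(revenue_per_user: float,
                     inference_cost_per_user: float,
                     guardrail: float = 0.40) -> dict:
    """Flag the actions to take when gross margin per user dips below guardrail."""
    margin = (revenue_per_user - inference_cost_per_user) / revenue_per_user
    breached = margin < guardrail
    return {
        "gross_margin": round(margin, 3),
        "pause_paid_marketing": breached,  # turn off paid acquisition
        "gate_new_signups": breached,      # until supply or prices catch up
    }
```

In practice this would feed your ad-platform APIs and signup gate; the value is that the response to eroding margin is automatic rather than discovered in the next board deck.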
When Notion AI opened its waitlist in November 2022, the team expected 200k sign‑ups. Five weeks later, it hit one million. Early users, however, complained they “still didn’t know what to do with this thing.” Notion pivoted from “AI writes for you” to “AI improves what you’ve already written,” redesigned starter prompts, and saw usage rise as the mental cost of exploration dropped.
Your most valuable users aren’t prompt engineers. If creating value feels like homework—endless trial‑and‑error instructions—activation stalls, and word‑of‑mouth dies.
Replace blank prompt boxes with role‑based starter chips and one‑click examples that autofill.
Instrument “prompt redo” and “undo” events as frustration signals; run weekly funnel reviews on them.
Offer an “explain my last prompt” toggle—turning trial‑and‑error into guided learning.
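The "prompt redo" and "undo" instrumentation above can be reduced to a simple aggregation over your event stream. This sketch assumes events arrive as (session_id, event_name) pairs; the event names and the threshold of three are hypothetical.

```python
# Hedged sketch: surface sessions with repeated frustration signals
# ("prompt_redo", "undo") as candidates for the weekly funnel review.
# Event names and the threshold are illustrative assumptions.
from collections import Counter

FRUSTRATION_EVENTS = {"prompt_redo", "undo"}

def frustrated_sessions(events, threshold=3):
    """Return session ids with at least `threshold` frustration events."""
    counts = Counter(
        session_id
        for session_id, event_name in events
        if event_name in FRUSTRATION_EVENTS
    )
    return sorted(s for s, n in counts.items() if n >= threshold)
```

Whatever analytics stack you use, the key is that redo/undo counts become a first-class activation metric, not an afterthought.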
Perplexity’s first token-metered API looked rational but backfired, causing sticker shock for customers and resulting in an eventual rollout of three “search‑mode” tiers in 2025.
Similarly, Notion AI’s initial pay-as-you-go pricing led beta customers to hoard credits instead of experimenting, kneecapping engagement. Weeks before GA, the team ripped out usage‑based billing in favor of a simple flat add‑on, trading perfect unit economics for faster habit formation.
AI usage is spiky and unfamiliar; users can’t predict how many tokens they’ll burn. If pricing feels like surge‑billing, exploration stops, and upgrade rates flat‑line.
When financially viable, start with a flat plan that eliminates mental transaction costs; layer in usage-based overages only once users reach actual ceilings.
Expose “token burn so far” inside the interface, not tucked away in billing dashboards.
Run price‑sensitivity experiments on engaged cohorts, not top‑of‑funnel sign‑ups; willingness to pay lags perceived mastery.
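The "flat plan with overages" model above is simple enough to express directly. All numbers in this sketch—the fee, the included-token allowance, the overage rate—are made-up examples, not pricing advice.

```python
# Illustrative flat-fee-plus-overage bill: the flat fee covers an
# included-token allowance; extra usage bills per additional million
# tokens. All prices and allowances here are invented for the example.
def monthly_bill(tokens_used: int,
                 flat_fee: float = 20.0,
                 included_tokens: int = 1_000_000,
                 overage_per_million: float = 2.0) -> float:
    """Return the month's charge given total tokens consumed."""
    overage_tokens = max(0, tokens_used - included_tokens)
    return flat_fee + (overage_tokens / 1_000_000) * overage_per_million
```

Users inside the allowance see one predictable number, which preserves the exploration the flat plan is meant to encourage, while heavy users still pay roughly in proportion to cost.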
Ultimately, far more startups die from a lack of attention than from an excess of it. Don’t take this advice as permission to “get everything ready for web-scale before you launch”: AI or not, a startup’s top priority must remain building something people want.
Nonetheless, with strategic investments in pricing, usability, and planning, an AI startup can prepare itself to ride the attention wave and avoid a wipeout.