The Commercial Feature Flag Problem
Feature flags are essential. I've written about why they transform deployment confidence — the short version: they decouple code deployment from feature activation, enabling gradual rollouts, instant kill switches, and A/B experiments without deployment risk.
The question is implementation: commercial SaaS (LaunchDarkly, Split.io), or in-house?
The answer for most teams is neither. It's a hybrid.
The Two-Tier Architecture
Frontend flags and backend flags have fundamentally different requirements:
Frontend flags power A/B experiments. They need statistical significance calculations, experiment analytics, and user segmentation. Commercial tools do this well.
Backend flags control which features are active at the system level. They need to be fast (evaluated on every authenticated request), reliable (an external outage should not affect production), and auditable. These requirements favour in-house implementation.
| Tier | Tool | Use Cases |
| --- | --- | --- |
| Frontend | Statsig | A/B experiments, UI variants, conversion optimisation |
| Backend | Custom DB flags | Country-level features, user-specific overrides, critical toggles |
Why Statsig Instead of LaunchDarkly
LaunchDarkly is excellent. But per-seat pricing becomes expensive as teams scale. For most teams below ~100 engineers, Statsig offers comparable A/B experimentation with built-in statistical significance, a generous free tier, and better pricing at scale.
The key question: do you need LaunchDarkly's advanced targeting (user attribute-based segmentation, complex rule chains)? If the answer is "mainly percentage rollouts and A/B experiments", a lighter-weight alternative likely suffices.
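Percentage rollouts, for what it's worth, don't require a vendor at all. A minimal sketch of how one might implement them (the function name and flag names here are illustrative, not part of any product's API): hash the user ID into a stable bucket, so a user stays in or out of the rollout for as long as the percentage holds.

```typescript
import { createHash } from "crypto";

// Hypothetical sketch: deterministic percentage rollout without a SaaS SDK.
// Hashing (featureName, userId) gives each user a stable bucket 0–99, so the
// same user always sees the same variant while the percentage is unchanged.
function inRollout(featureName: string, userId: string, percentage: number): boolean {
  const hash = createHash("sha256").update(`${featureName}:${userId}`).digest();
  const bucket = hash.readUInt32BE(0) % 100; // 0–99
  return bucket < percentage;
}
```

Raising the percentage only ever adds users (buckets below the new threshold), which is exactly the staged-rollout behaviour you want.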
Why Custom DB Flags for the Backend
The most important reason: no external dependency for critical paths.
A feature flag that controls whether subscription renewals use the old or new payment flow cannot have a SaaS dependency. If the flag service has a 5-minute outage, what does your subscription engine do?
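The usual answer is defensive wrapping: every remote evaluation on a critical path gets a timeout and a hard-coded safe default. A sketch of that pattern, assuming a vendor SDK call stands behind the `lookup` promise (nothing here is a real SDK API):

```typescript
// Hypothetical sketch: race a remote flag lookup against a timeout, falling
// back to a hard-coded safe default if the flag service hangs or errors.
// This is the defensive code a SaaS dependency forces onto a critical path.
async function withSafeDefault(
  lookup: Promise<boolean>,
  safeDefault: boolean,
  timeoutMs = 200
): Promise<boolean> {
  const timeout = new Promise<boolean>((resolve) =>
    setTimeout(() => resolve(safeDefault), timeoutMs)
  );
  try {
    return await Promise.race([lookup, timeout]);
  } catch {
    return safeDefault; // SDK threw: fall back rather than block renewals
  }
}
```

Note what this buys you: during an outage every evaluation silently returns the default, which may be exactly the wrong flow for the renewal in flight. The wrapper contains the blast radius; it doesn't remove it.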
With a custom DB flag:
- Evaluated with a simple database query (cached in Redis)
- Works even if every external service is down
- Full audit trail in your own database
- No per-evaluation costs
The Priority System
For a multi-market business (UK and Germany in our case):
```typescript
async function isFeatureEnabled(
  featureName: string,
  userId?: string,
  countryCode?: string
): Promise<boolean> {
  // 1. User-specific (highest priority — internal testing, staged rollouts)
  if (userId) {
    const userFlag = await cache.get(`flag:${featureName}:user:${userId}`);
    if (userFlag !== null) return userFlag;
  }

  // 2. Country-level (UK vs DE feature differences)
  if (countryCode) {
    const countryFlag = await cache.get(`flag:${featureName}:country:${countryCode}`);
    if (countryFlag !== null) return countryFlag;
  }

  // 3. Global default
  return (await cache.get(`flag:${featureName}`)) ?? false;
}
```
Cached in Redis with a 60-second TTL. Negligible latency. Instant kill-switch capability.
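The read path above only touches the cache, so the kill switch depends on the write path invalidating it. A sketch of that half, with in-memory Maps standing in for the Postgres table and Redis (`dbFlags`, `cache`, and both function names are illustrative):

```typescript
// Hypothetical sketch of the write path behind the kill switch. The cache
// delete on write is what makes a toggle visible on the next read instead
// of waiting out the 60-second TTL.
const dbFlags = new Map<string, boolean>(); // stands in for the flags table
const cache = new Map<string, boolean>();   // stands in for Redis

async function setFlag(featureName: string, enabled: boolean): Promise<void> {
  dbFlags.set(featureName, enabled);   // source of truth; real table also logs who/when
  cache.delete(`flag:${featureName}`); // invalidate so the next read refetches
}

async function getFlag(featureName: string): Promise<boolean> {
  const cached = cache.get(`flag:${featureName}`);
  if (cached !== undefined) return cached;
  const value = dbFlags.get(featureName) ?? false; // DB lookup on cache miss
  cache.set(`flag:${featureName}`, value);         // re-warm (real code sets a 60s TTL)
  return value;
}
```

This cache-aside shape is also why the TTL can be generous: steady-state reads never hit the database, and the explicit invalidation handles the one case where staleness matters.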
This lets you:
- Enable a new checkout flow for internal team accounts before any customers see it
- Roll out a new payment method to German customers only
- Globally toggle a feature in seconds
When to Pay for Commercial Tooling
The hybrid approach has limits. If you need:
- Complex user attribute-based targeting
- Real-time flag evaluation without database dependency
- Non-engineer self-service flag configuration with a proper UI
- SDKs for multiple platforms (iOS, Android, web, backend)
...then a commercial tool earns its cost. The calculation changes when your A/B programme is sophisticated enough that the statistical tooling genuinely saves engineering time.
For most teams below ~100 engineers, the hybrid covers 95% of use cases. Start there. Migrate if you genuinely outgrow it.