Feature Flag System
Proxy-based gating keeps flag checks invisible to callers. Decorator-layered evaluation stacks kill-switches and overrides without combinatorial explosion. Swappable targeting strategies let you go from percentage rollouts to user-segment rules at config time.
Key Abstractions
- FeatureFlag: Data class holding a flag's name, enabled state, and targeting rules that strategies evaluate against
- FlagEvaluator: Strategy interface with evaluate(flag, context) -> bool. Implementations include PercentageEvaluator, UserSegmentEvaluator, etc.
- EvaluationDecorator: Wraps a base FlagEvaluator and adds a layer such as kill-switch or manual override before delegating
- FlagManager: Singleton that serves as the central registry for all feature flags, ensuring consistent state across the application
- FeatureProxy: Proxy that wraps a service call and transparently checks a flag before routing to the new or old implementation
- FlagChangeListener: Observer that gets notified whenever a flag's state changes, enabling cache invalidation, logging, or metric emission
Class Diagram
How It Works
A feature flag system lets you separate deployment from release. Code ships to production behind a flag that is off. When you're ready, you flip the flag: no redeploy, no merge, no prayer. If something breaks, flip it back. The system needs to evaluate flags fast, layer multiple targeting rules cleanly, and notify interested parties when flags change.
Here's the core evaluation flow: a request arrives, the proxy intercepts it, asks the FlagManager for the flag, and hands the flag plus user context to an evaluator chain. The evaluator chain is a stack of decorators wrapping a base strategy. The outermost decorator might be an override check (for manual interventions). Below that, a kill-switch decorator catches emergency shutoffs. At the bottom, the base strategy runs the actual targeting logic: hash the user into a percentage bucket, check their segment, or whatever the flag is configured for. If any layer says no, evaluation short-circuits. If everything says yes, the new code path runs.
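The layered flow above can be sketched with a stripped-down chain. This is a simplified illustration, not the full implementation from the Code Implementation section: the class names here are shortened and the base strategy is a placeholder.

```python
class BaseStrategy:
    """Bottom of the chain: the actual targeting logic (placeholder here)."""
    def evaluate(self, flag_name: str, context: dict) -> bool:
        return context.get("user_id", "") != ""  # stand-in targeting rule


class KillSwitch:
    """Outer layer: an emergency shutoff short-circuits everything below."""
    def __init__(self, inner, killed: set[str]):
        self._inner = inner
        self._killed = killed

    def evaluate(self, flag_name: str, context: dict) -> bool:
        if flag_name in self._killed:
            return False  # short-circuit: no lower layer runs
        return self._inner.evaluate(flag_name, context)


chain = KillSwitch(BaseStrategy(), killed={"broken_flag"})
print(chain.evaluate("broken_flag", {"user_id": "alice"}))   # False
print(chain.evaluate("new_checkout", {"user_id": "alice"}))  # True
```

The kill-switch never has to know what targeting logic sits beneath it; it only decides whether to delegate.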
The proxy pattern keeps all of this invisible to application code. The caller asks for a Service and gets a FeatureProxy that implements the same interface. The proxy handles all flag evaluation internally. When the flag is eventually removed, you swap the proxy for the real service and delete the flag. No scattered if/else blocks to hunt down.
Requirements
Functional
- Register and manage feature flags with names, enabled states, and targeting rules
- Evaluate flags against user context using swappable strategies (percentage rollout, user segment, etc.)
- Layer evaluation logic: kill-switches and manual overrides should take precedence over base targeting
- Gate service calls behind flags transparently via proxy, routing to new or old implementations
- Notify observers when flag state changes for audit logging and metrics
Non-Functional
- Flag evaluation must default to "off" on any error: missing flags, evaluator exceptions, malformed context
- Consistent flag state across the application via a single FlagManager instance
- Evaluation should be fast: no network calls in the hot path, no blocking I/O
- Support adding new targeting strategies without modifying existing evaluation code
Design Decisions
Couldn't you just use if/else for flag checks?
Scattering if (flagEnabled("new_checkout")) across your codebase creates two problems. First, you have to find every check when the flag is cleaned up. Miss one and you have dead code. Miss the wrong one and you have a bug. Second, the flag check is coupled to business logic. The checkout handler now knows about feature flags, which is not its job.
A proxy wraps the entire service behind the same interface. Callers never know the flag exists. When the flag is removed, you replace the proxy with the winning implementation. One change in the wiring layer, zero changes in business code. Especially valuable when the same flag gates multiple code paths: the proxy handles all of them in one place.
What's wrong with a flat rule list instead of Decorator?
You could evaluate all rules in a flat list: iterate through kill-switches, overrides, percentage checks, and segment checks in a loop. But ordering becomes rigid. What if kill-switches should always run first? What if overrides should bypass everything? Ordering logic in a flat list requires index management and convention.
Decorators make the ordering explicit and structural. The outermost decorator runs first. Each decorator either short-circuits or delegates to the next layer. You can compose different evaluation stacks for different flags. A critical infrastructure flag might have kill-switch and override only. A UI experiment gets the full chain with percentage targeting at the base. The composition is visible in the construction, not hidden in loop ordering.
Singleton concerns for FlagManager
A singleton is a code smell in most contexts, but FlagManager is one of the legitimate cases. If two parts of your application create separate FlagManager instances, they get separate flag registries. Flag A is enabled in one and disabled in the other. Not a theoretical risk: it's a guaranteed bug in any dependency injection setup that accidentally creates two instances.
The tradeoff is testability. You can't easily swap in a mock FlagManager. The reset() method is a pragmatic escape hatch for testing. In production, the singleton guarantee is worth the test ergonomics cost. Alternatively, you can make it a "singleton by convention" managed by your DI container rather than enforced by the class itself.
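The "singleton by convention" alternative can be sketched as follows. The registry is a plain class, and a single composition root constructs exactly one instance and hands it to every consumer; the names (FlagRegistry, CheckoutHandler, build_app) are illustrative, not part of the implementation below.

```python
class FlagRegistry:
    """Plain class: no singleton enforcement, testable with a fresh instance."""
    def __init__(self):
        self._flags: dict[str, bool] = {}

    def set(self, name: str, enabled: bool) -> None:
        self._flags[name] = enabled

    def is_enabled(self, name: str) -> bool:
        return self._flags.get(name, False)  # missing flag defaults to off


class CheckoutHandler:
    def __init__(self, flags: FlagRegistry):
        self._flags = flags  # dependency injected, never constructed here

    def handle(self) -> str:
        return "new" if self._flags.is_enabled("new_checkout") else "old"


def build_app():
    """Composition root: the ONE place that constructs the registry."""
    registry = FlagRegistry()
    checkout = CheckoutHandler(registry)  # every consumer shares this instance
    return registry, checkout
```

In tests, you simply construct a fresh FlagRegistry; in production, the wiring layer guarantees uniqueness instead of the class itself.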
How does gradual rollout stay deterministic?
Percentage-based rollout uses a hash of the flag name and user ID, not a random number. If you use random() < 0.5, a user might see the new checkout on one request and the old checkout on the next. Hashing gives deterministic bucketing: the same user always lands in the same bucket for the same flag. Increasing the percentage from 10% to 20% adds new users without moving existing ones. This property is called monotonic rollout and it's what makes gradual rollouts safe.
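The monotonic property can be checked directly. This sketch mirrors the hashing scheme used in the implementation below: every user enabled at 10% remains enabled at 20%, because bucket assignment never changes, only the threshold moves.

```python
import hashlib

def bucket(flag_name: str, user_id: str) -> int:
    """Deterministic 0-99 bucket from a hash of flag name + user id."""
    h = int(hashlib.md5(f"{flag_name}:{user_id}".encode()).hexdigest(), 16)
    return h % 100

users = [f"user-{i}" for i in range(1000)]
at_10 = {u for u in users if bucket("new_checkout", u) < 10}
at_20 = {u for u in users if bucket("new_checkout", u) < 20}

# Monotonic rollout: nobody enabled at 10% is dropped at 20%.
assert at_10 <= at_20
```

With random() instead of a hash, the two sets would be unrelated and users would flicker between code paths.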
Interview Follow-ups
- "How would you integrate A/B testing?" Add a variant system where a flag can have multiple values (not just on/off). The evaluator returns a variant key instead of a boolean. Each variant maps to a different implementation. Log the variant assignment for each user so the analytics pipeline can compute conversion rates per variant. The proxy routes to the matching implementation.
- "How would you sync flags across a distributed fleet?" Use a central flag store (Redis, DynamoDB, or a dedicated service like LaunchDarkly). Each application instance keeps a local cache with a short TTL. For faster propagation, use a pub/sub channel (Redis Pub/Sub, Kafka) to push flag changes. The FlagChangeListener observer pattern maps directly to this: listeners subscribe to the push channel and update the local cache.
- "How do you handle flag cleanup?" Flag cleanup is a discipline problem, not a code problem. Set an expiration date when creating the flag. Run a weekly report of flags past their expiration. Each flag should have an owner. The proxy pattern makes removal easy: replace the proxy with the winning service. Automated tests should assert that removed flags are not referenced anywhere in the codebase.
- "What is dark launching and how do flags enable it?" Dark launching means running new code in production without exposing results to users. Enable the flag for 100% of traffic but discard the new code path's response and return the old one. Compare the new output against the old output offline to validate correctness. A specific proxy configuration handles this: call both services, return the old response, log the diff.
Code Implementation
```python
from __future__ import annotations
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
import hashlib


# --------------- Feature Flag Data Class ---------------

@dataclass
class FeatureFlag:
    """Core data class representing a single feature flag."""
    name: str
    enabled: bool = False
    targeting_rules: dict = field(default_factory=dict)


# --------------- Strategy: Flag Evaluators ---------------

class FlagEvaluator(ABC):
    @abstractmethod
    def evaluate(self, flag: FeatureFlag, context: dict) -> bool:
        ...


class PercentageEvaluator(FlagEvaluator):
    """Hashes user_id into a 0-99 bucket; allows if bucket < percentage."""

    def __init__(self, percentage: int):
        self._percentage = max(0, min(100, percentage))

    def evaluate(self, flag: FeatureFlag, context: dict) -> bool:
        if not flag.enabled:
            return False
        user_id = context.get("user_id", "")
        hash_val = int(hashlib.md5(
            f"{flag.name}:{user_id}".encode()
        ).hexdigest(), 16)
        bucket = hash_val % 100
        return bucket < self._percentage


class UserSegmentEvaluator(FlagEvaluator):
    """Allows if the user's segment is in the allowed set."""

    def __init__(self, allowed_segments: set[str]):
        self._allowed = allowed_segments

    def evaluate(self, flag: FeatureFlag, context: dict) -> bool:
        if not flag.enabled:
            return False
        user_segment = context.get("segment", "")
        return user_segment in self._allowed


# --------------- Decorator: Layered Evaluation ---------------

class EvaluationDecorator(FlagEvaluator):
    """Base decorator wrapping another evaluator."""

    def __init__(self, wrapped: FlagEvaluator):
        self._wrapped = wrapped

    def evaluate(self, flag: FeatureFlag, context: dict) -> bool:
        return self._wrapped.evaluate(flag, context)


class KillSwitchDecorator(EvaluationDecorator):
    """If the flag is in the killed set, return False immediately."""

    def __init__(self, wrapped: FlagEvaluator, killed_flags: set[str]):
        super().__init__(wrapped)
        self._killed = killed_flags

    def evaluate(self, flag: FeatureFlag, context: dict) -> bool:
        if flag.name in self._killed:
            return False
        return super().evaluate(flag, context)


class OverrideDecorator(EvaluationDecorator):
    """Manual per-flag overrides bypass all downstream evaluation."""

    def __init__(self, wrapped: FlagEvaluator,
                 overrides: dict[str, bool]):
        super().__init__(wrapped)
        self._overrides = overrides

    def evaluate(self, flag: FeatureFlag, context: dict) -> bool:
        if flag.name in self._overrides:
            return self._overrides[flag.name]
        return super().evaluate(flag, context)


# --------------- Observer: Flag Change Listener ---------------

class FlagChangeListener(ABC):
    @abstractmethod
    def on_flag_changed(self, name: str, old_val: bool,
                        new_val: bool) -> None:
        ...


class AuditLogger(FlagChangeListener):
    def __init__(self):
        self.log: list[str] = []

    def on_flag_changed(self, name: str, old_val: bool,
                        new_val: bool) -> None:
        entry = f"[AUDIT] {name}: {old_val} -> {new_val}"
        self.log.append(entry)
        print(f" {entry}")


class MetricsEmitter(FlagChangeListener):
    def on_flag_changed(self, name: str, old_val: bool,
                        new_val: bool) -> None:
        print(f" [METRIC] flag_toggled(name={name}, "
              f"new_state={new_val})")


# --------------- Singleton: Flag Manager ---------------

class FlagManager:
    """Central registry for all feature flags (singleton)."""
    _instance: "FlagManager | None" = None

    def __init__(self):
        self._flags: dict[str, FeatureFlag] = {}
        self._listeners: list[FlagChangeListener] = []

    @classmethod
    def get_instance(cls) -> "FlagManager":
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    @classmethod
    def reset(cls) -> None:
        cls._instance = None

    def register_flag(self, flag: FeatureFlag) -> None:
        self._flags[flag.name] = flag

    def get_flag(self, name: str) -> FeatureFlag | None:
        return self._flags.get(name)

    def update_flag(self, name: str, enabled: bool) -> None:
        flag = self._flags.get(name)
        if flag is None:
            return
        old_val = flag.enabled
        flag.enabled = enabled
        if old_val != enabled:
            for listener in self._listeners:
                listener.on_flag_changed(name, old_val, enabled)

    def add_listener(self, listener: FlagChangeListener) -> None:
        self._listeners.append(listener)


# --------------- Proxy: Feature-Gated Service ---------------

class Service(ABC):
    @abstractmethod
    def execute(self, request: str) -> str:
        ...


class NewCheckoutService(Service):
    def execute(self, request: str) -> str:
        return f"[NEW checkout] processed: {request}"


class OldCheckoutService(Service):
    def execute(self, request: str) -> str:
        return f"[OLD checkout] processed: {request}"


class FeatureProxy(Service):
    """Proxy that checks a flag before routing to new or old impl."""

    def __init__(self, flag_name: str, evaluator: FlagEvaluator,
                 new_service: Service, old_service: Service,
                 manager: FlagManager):
        self._flag_name = flag_name
        self._evaluator = evaluator
        self._new = new_service
        self._old = old_service
        self._manager = manager

    def execute(self, request: str, context: dict | None = None
                ) -> str:
        context = context or {}
        flag = self._manager.get_flag(self._flag_name)
        if flag is None:
            return self._old.execute(request)
        try:
            if self._evaluator.evaluate(flag, context):
                return self._new.execute(request)
        except Exception:
            pass  # Default to off on errors
        return self._old.execute(request)


# --------------- Demo ---------------

if __name__ == "__main__":
    FlagManager.reset()
    mgr = FlagManager.get_instance()

    # Attach observers
    audit = AuditLogger()
    metrics = MetricsEmitter()
    mgr.add_listener(audit)
    mgr.add_listener(metrics)

    # Register flags
    mgr.register_flag(FeatureFlag("new_checkout", True,
                                  {"rollout": 50}))
    mgr.register_flag(FeatureFlag("dark_mode", True,
                                  {"segments": ["beta"]}))
    mgr.register_flag(FeatureFlag("experimental_search", True))

    # 1. Strategy: percentage-based evaluation
    print("=== Percentage Evaluator (50% rollout) ===")
    pct_eval = PercentageEvaluator(percentage=50)
    flag = mgr.get_flag("new_checkout")
    for uid in ["alice", "bob", "carol", "dave", "eve",
                "frank", "grace", "heidi"]:
        ctx = {"user_id": uid}
        result = pct_eval.evaluate(flag, ctx)
        status = "ON" if result else "OFF"
        print(f" user={uid:6s} -> {status}")

    # 2. Strategy: segment-based evaluation
    print("\n=== User Segment Evaluator ===")
    seg_eval = UserSegmentEvaluator({"beta", "internal"})
    dark = mgr.get_flag("dark_mode")
    for seg in ["beta", "internal", "free", "enterprise"]:
        ctx = {"segment": seg}
        result = seg_eval.evaluate(dark, ctx)
        status = "ON" if result else "OFF"
        print(f" segment={seg:12s} -> {status}")

    # 3. Decorator layering
    print("\n=== Decorator Layering ===")
    base = PercentageEvaluator(percentage=100)  # 100% on
    killed = KillSwitchDecorator(base,
                                 killed_flags={"experimental_search"})
    overridden = OverrideDecorator(killed,
                                   overrides={"dark_mode": False})

    exp_flag = mgr.get_flag("experimental_search")
    result = overridden.evaluate(exp_flag, {"user_id": "alice"})
    print(f" experimental_search (kill-switched): {result}")

    result = overridden.evaluate(dark, {"user_id": "alice"})
    print(f" dark_mode (override=False): {result}")

    result = overridden.evaluate(flag, {"user_id": "alice"})
    print(f" new_checkout (no override/kill): {result}")

    # 4. Proxy: transparent gating
    print("\n=== Feature Proxy ===")
    proxy = FeatureProxy(
        "new_checkout", pct_eval,
        NewCheckoutService(), OldCheckoutService(), mgr
    )
    for uid in ["alice", "bob", "carol"]:
        out = proxy.execute(f"order-{uid}", {"user_id": uid})
        print(f" {uid}: {out}")

    # 5. Observer: flag change notification
    print("\n=== Observer Notification ===")
    mgr.update_flag("new_checkout", False)
    mgr.update_flag("dark_mode", False)

    # Verify proxy defaults to old after flag disabled
    print("\n=== Proxy After Flag Disabled ===")
    out = proxy.execute("order-alice", {"user_id": "alice"})
    print(f" alice: {out}")

    print(f"\n=== Audit Log ({len(audit.log)} entries) ===")
    for entry in audit.log:
        print(f" {entry}")

    print("\nAll operations completed successfully.")
```
Common Mistakes
- ✗Not defaulting to 'off' on evaluation errors. If the evaluator throws or the flag is missing, the safe default is disabled. Failing open exposes unfinished features to everyone.
- ✗Serving stale flags from a local cache without a TTL or push-based invalidation. A flag turned off in the dashboard stays on for minutes in production.
- ✗Coupling flag evaluation logic directly into business code with scattered if/else checks. Flag removal becomes painful and leaves dead code everywhere.
- ✗Missing an audit trail for flag changes. When an incident happens and a flag was toggled, you need to know who changed it, when, and what the previous state was.
Key Points
- ✓Proxy pattern makes flag checks invisible to calling code. The caller invokes the same interface regardless of which implementation runs behind it.
- ✓Decorator pattern lets you stack evaluation layers: kill-switch, override, percentage. Each decorator adds one concern, with no monolithic if/else chain.
- ✓Singleton for FlagManager guarantees every part of the application reads from the same flag registry. Two registries means two sources of truth and subtle bugs.
- ✓Strategy pattern for targeting means you can swap from percentage-based rollout to user-segment targeting by changing the evaluator, not the flag system.