Platform ScaleProblem 28 of 29

Platform ScaleAdvanced

Stack Overflow

Observer PatternStrategy PatternState Pattern

Q&A platform with reputation system, voting mechanics, and search ranking. Observer pattern for reputation events, Strategy pattern for search, and State pattern for question lifecycle. Reputation thresholds gate privileged actions like voting and closing.

Key Abstractions

Platform

Facade coordinating users, questions, answers, votes, tags, and search

Question

Holds title, body, tags, answers list, vote count, and open/closed state

Answer

Response to a question with its own vote count and accepted/rejected status

User

Author with reputation score that gates privileged actions like voting and closing

Tag

Categorization label attached to questions for filtering and trending

SearchStrategy

Strategy interface for ranking search results by relevance, votes, or recency

Class Diagram

The Key Insight

Stack Overflow looks like a simple CRUD app on the surface. Users post questions, others post answers, people vote. But the real design challenge is the reputation system that ties everything together. Reputation is not just a vanity number. It is an access control mechanism. You need 15 reputation to upvote, 50 to comment, 500 to close questions. These thresholds create a self-moderating community where new users earn privileges by contributing quality content.

The voting mechanics have second-order effects that most people miss. An upvote on your answer gives you +10 reputation. A downvote costs the author -2, but it also costs the voter -1. That asymmetry is intentional. It discourages frivolous downvoting while still allowing the community to signal low-quality content. Good contributions earn privileges, which lets users do more moderation, which improves content quality. It is a virtuous cycle built into the data model.

From a patterns perspective, three patterns carry the weight. Observer propagates reputation changes when votes happen. Strategy makes search ranking pluggable (sort by relevance, votes, or date). State manages the question lifecycle: open questions accept answers, closed questions reject them. No boolean flags, no scattered if-checks.

Requirements

Functional

Users register and build reputation through contributions
Post questions with title, body, and tags
Post answers to open questions
Upvote and downvote questions and answers, with reputation effects on both voter and author
Accept an answer (only the question author can do this)
Close questions (requires sufficient reputation threshold)
Search questions with pluggable ranking strategies (relevance, votes, date)

Non-Functional

Reputation thresholds enforce privilege escalation for voting, commenting, and closing
Prevent double-voting by the same user on the same content
Closed questions must reject new answers at the state level, not through if-checks scattered everywhere
Search ranking must be swappable without modifying the search infrastructure

Design Decisions

Why State pattern for question lifecycle instead of a status enum?

A status enum like OPEN/CLOSED means every method that cares about state needs if (status == OPEN) ... else ... checks. The Question class accumulates these checks across addAnswer, close, reopen, and protect operations. State pattern moves the behavior into state objects. OpenState.addAnswer() works normally. ClosedState.addAnswer() throws an exception. The Question delegates to its current state without knowing which state it is in. Adding a "protected" state where only high-rep users can answer means creating one new class, not editing five methods.

Why store reputation as a running total instead of computing it?

You could recompute reputation from the full vote history every time you need it. But reputation checks happen constantly: every vote attempt, every comment attempt, every close vote. Scanning all votes for a user on every action is O(n) in the number of votes they have ever received. A running total updated on each vote is O(1). The tradeoff is keeping the total consistent with the underlying events, but that is straightforward when reputation updates happen in the same transaction as the vote.

Why the -1 penalty for downvoting?

Without a cost, downvoting is free. Serial downvoters can tank someone's reputation with zero risk to themselves. The -1 cost makes voters think twice. Is this answer bad enough that I am willing to spend a reputation point to signal it? This small friction dramatically reduces abuse while preserving the ability to flag genuinely poor content.

Why Strategy for search instead of hardcoded sorting?

Users want different things at different times. Someone debugging a specific error wants relevance-based results. Someone exploring a topic wants the highest-voted answers. A moderator tracking recent activity wants date-sorted results. Hardcoding any single approach means the others require code changes. Strategy pattern lets you swap ranking logic at runtime. The search method filters candidates and delegates ranking to whatever strategy is currently active.

Interview Follow-ups

"How would you implement bounties?" Add a Bounty object that locks reputation from the poster upfront. When the bounty period ends, award the locked reputation to the best answer. If no answer qualifies, half the bounty returns to the poster and the other half is destroyed.
"How would you handle spam detection?" Introduce a moderation pipeline. Flag posts that match spam patterns like high link density, repeated content, or rapid posting from low-rep accounts. Use the Observer pattern to notify moderators when flags accumulate past a threshold.
"What about real-time updates when new answers arrive?" WebSocket connections per question page. When an answer is posted, the observer notifies connected clients. Send a lightweight payload with the answer preview so the UI can show "1 new answer posted" without a full page reload.
"How would you scale the search?" Extract search into a dedicated service backed by Elasticsearch or similar. Index questions and answers as documents. The SearchStrategy interface stays the same at the API layer, but the implementation queries an external index instead of scanning in-memory data.

Code Implementation

  1  from abc import ABC, abstractmethod
  2  from dataclasses import dataclass, field
  3  from datetime import datetime
  4  from typing import Optional
  5  import uuid
  6  
  7  
  8  # --- Reputation thresholds ---
  9  VOTE_THRESHOLD = 15
 10  COMMENT_THRESHOLD = 50
 11  CLOSE_THRESHOLD = 500
 12  UPVOTE_REP = 10
 13  DOWNVOTE_REP = -2
 14  ANSWER_ACCEPTED_REP = 15
 15  ACCEPT_BONUS_REP = 2
 16  
 17  
 18  class User:
 19      """Tracks reputation and enforces privilege thresholds.
 20      Reputation starts at 1 and never drops below 1."""
 21  
 22      def __init__(self, user_id: str, username: str):
 23          self.user_id = user_id
 24          self.username = username
 25          self._reputation = 1
 26  
 27      @property
 28      def reputation(self) -> int:
 29          return self._reputation
 30  
 31      def add_reputation(self, points: int) -> None:
 32          self._reputation = max(1, self._reputation + points)
 33  
 34      def can_vote(self) -> bool:
 35          return self._reputation >= VOTE_THRESHOLD
 36  
 37      def can_comment(self) -> bool:
 38          return self._reputation >= COMMENT_THRESHOLD
 39  
 40      def can_close_question(self) -> bool:
 41          return self._reputation >= CLOSE_THRESHOLD
 42  
 43      def __repr__(self) -> str:
 44          return f"User({self.username}, rep={self._reputation})"
 45  
 46  
 47  # --- Question State (State Pattern) ---
 48  
 49  class QuestionState(ABC):
 50      @abstractmethod
 51      def add_answer(self, question: "Question", answer: "Answer") -> None: ...
 52  
 53      @abstractmethod
 54      def close(self, question: "Question") -> None: ...
 55  
 56      @abstractmethod
 57      def state_name(self) -> str: ...
 58  
 59  
 60  class OpenState(QuestionState):
 61      def add_answer(self, question: "Question", answer: "Answer") -> None:
 62          question.answers.append(answer)
 63  
 64      def close(self, question: "Question") -> None:
 65          question.state = ClosedState()
 66  
 67      def state_name(self) -> str:
 68          return "OPEN"
 69  
 70  
 71  class ClosedState(QuestionState):
 72      def add_answer(self, question: "Question", answer: "Answer") -> None:
 73          raise PermissionError("Cannot add answers to a closed question")
 74  
 75      def close(self, question: "Question") -> None:
 76          pass  # already closed
 77  
 78      def state_name(self) -> str:
 79          return "CLOSED"
 80  
 81  
 82  # --- Tag ---
 83  
 84  class Tag:
 85      def __init__(self, name: str):
 86          self.name = name
 87          self.question_count = 0
 88  
 89      def increment(self) -> None:
 90          self.question_count += 1
 91  
 92      def __repr__(self) -> str:
 93          return self.name
 94  
 95  
 96  # --- Answer ---
 97  
 98  class Answer:
 99      def __init__(self, answer_id: str, body: str, author: User):
100          self.answer_id = answer_id
101          self.body = body
102          self.author = author
103          self.votes = 0
104          self.accepted = False
105          self.created_at = datetime.now()
106  
107      def vote(self, is_upvote: bool) -> None:
108          self.votes += 1 if is_upvote else -1
109  
110      def accept(self) -> None:
111          self.accepted = True
112  
113      @property
114      def score(self) -> int:
115          return self.votes + (15 if self.accepted else 0)
116  
117      def __repr__(self) -> str:
118          status = " [ACCEPTED]" if self.accepted else ""
119          return f"Answer(by={self.author.username}, votes={self.votes}{status})"
120  
121  
122  # --- Question ---
123  
124  class Question:
125      def __init__(self, question_id: str, title: str, body: str,
126                   author: User, tags: list[Tag]):
127          self.question_id = question_id
128          self.title = title
129          self.body = body
130          self.author = author
131          self.tags = tags
132          self.answers: list[Answer] = []
133          self.votes = 0
134          self.state: QuestionState = OpenState()
135          self.created_at = datetime.now()
136  
137      def add_answer(self, answer: Answer) -> None:
138          self.state.add_answer(self, answer)
139  
140      def vote(self, is_upvote: bool) -> None:
141          self.votes += 1 if is_upvote else -1
142  
143      def close(self) -> None:
144          self.state.close(self)
145  
146      @property
147      def score(self) -> int:
148          return self.votes
149  
150      def __repr__(self) -> str:
151          tag_str = ", ".join(t.name for t in self.tags)
152          return (f"Question('{self.title}', by={self.author.username}, "
153                  f"votes={self.votes}, answers={len(self.answers)}, "
154                  f"state={self.state.state_name()}, tags=[{tag_str}])")
155  
156  
157  # --- Search Strategies ---
158  
159  class SearchStrategy(ABC):
160      @abstractmethod
161      def rank(self, questions: list[Question], query: str) -> list[Question]: ...
162  
163  
164  class RelevanceStrategy(SearchStrategy):
165      """Scores questions by keyword match frequency in title and body."""
166  
167      def rank(self, questions: list[Question], query: str) -> list[Question]:
168          terms = query.lower().split()
169  
170          def relevance(q: Question) -> int:
171              text = (q.title + " " + q.body).lower()
172              return sum(text.count(term) for term in terms)
173  
174          matching = [q for q in questions if relevance(q) > 0]
175          return sorted(matching, key=relevance, reverse=True)
176  
177  
178  class VoteStrategy(SearchStrategy):
179      """Returns matching questions sorted by vote count, highest first."""
180  
181      def rank(self, questions: list[Question], query: str) -> list[Question]:
182          terms = query.lower().split()
183          matching = [q for q in questions
184                      if any(t in (q.title + " " + q.body).lower() for t in terms)]
185          return sorted(matching, key=lambda q: q.votes, reverse=True)
186  
187  
188  class DateStrategy(SearchStrategy):
189      """Returns matching questions sorted by creation date, newest first."""
190  
191      def rank(self, questions: list[Question], query: str) -> list[Question]:
192          terms = query.lower().split()
193          matching = [q for q in questions
194                      if any(t in (q.title + " " + q.body).lower() for t in terms)]
195          return sorted(matching, key=lambda q: q.created_at, reverse=True)
196  
197  
198  # --- Platform Facade ---
199  
200  class Platform:
201      """Central coordinator. All user interactions go through here.
202      Manages reputation effects, vote deduplication, and search."""
203  
204      def __init__(self):
205          self._users: dict[str, User] = {}
206          self._questions: dict[str, Question] = {}
207          self._tags: dict[str, Tag] = {}
208          self._search_strategy: SearchStrategy = RelevanceStrategy()
209          self._vote_log: dict[str, set[str]] = {}
210  
211      def register_user(self, username: str) -> User:
212          user_id = str(uuid.uuid4())[:8]
213          user = User(user_id, username)
214          self._users[user_id] = user
215          return user
216  
217      def post_question(self, user_id: str, title: str, body: str,
218                        tag_names: list[str]) -> Question:
219          author = self._users[user_id]
220          tags = [self._get_or_create_tag(name) for name in tag_names]
221          for tag in tags:
222              tag.increment()
223  
224          q_id = str(uuid.uuid4())[:8]
225          question = Question(q_id, title, body, author, tags)
226          self._questions[q_id] = question
227          return question
228  
229      def post_answer(self, user_id: str, question_id: str, body: str) -> Answer:
230          author = self._users[user_id]
231          question = self._questions[question_id]
232          a_id = str(uuid.uuid4())[:8]
233          answer = Answer(a_id, body, author)
234          question.add_answer(answer)
235          return answer
236  
237      def vote(self, voter_id: str, question_or_answer, is_upvote: bool) -> None:
238          voter = self._users[voter_id]
239          if not voter.can_vote():
240              raise PermissionError(
241                  f"{voter.username} needs {VOTE_THRESHOLD} reputation to vote "
242                  f"(current: {voter.reputation})")
243  
244          # Prevent double voting
245          target_id = getattr(question_or_answer, 'question_id',
246                              getattr(question_or_answer, 'answer_id', None))
247          voter_targets = self._vote_log.setdefault(voter_id, set())
248          if target_id in voter_targets:
249              raise ValueError(f"{voter.username} already voted on this")
250          voter_targets.add(target_id)
251  
252          question_or_answer.vote(is_upvote)
253  
254          # Reputation effects
255          author = question_or_answer.author
256          if is_upvote:
257              author.add_reputation(UPVOTE_REP)
258          else:
259              author.add_reputation(DOWNVOTE_REP)
260              voter.add_reputation(-1)  # downvoting costs the voter too
261  
262      def accept_answer(self, user_id: str, answer: Answer, question: Question) -> None:
263          if question.author.user_id != user_id:
264              raise PermissionError("Only the question author can accept an answer")
265          answer.accept()
266          answer.author.add_reputation(ANSWER_ACCEPTED_REP)
267          question.author.add_reputation(ACCEPT_BONUS_REP)
268  
269      def close_question(self, user_id: str, question_id: str) -> None:
270          user = self._users[user_id]
271          if not user.can_close_question():
272              raise PermissionError(
273                  f"{user.username} needs {CLOSE_THRESHOLD} reputation to close questions")
274          self._questions[question_id].close()
275  
276      def set_search_strategy(self, strategy: SearchStrategy) -> None:
277          self._search_strategy = strategy
278  
279      def search(self, query: str) -> list[Question]:
280          all_questions = list(self._questions.values())
281          return self._search_strategy.rank(all_questions, query)
282  
283      def _get_or_create_tag(self, name: str) -> Tag:
284          if name not in self._tags:
285              self._tags[name] = Tag(name)
286          return self._tags[name]
287  
288  
289  if __name__ == "__main__":
290      platform = Platform()
291  
292      # Register users
293      alice = platform.register_user("alice")
294      bob = platform.register_user("bob")
295      charlie = platform.register_user("charlie")
296  
297      # Give alice and bob enough reputation to vote
298      alice.add_reputation(100)
299      bob.add_reputation(50)
300  
301      # Alice posts a question
302      q1 = platform.post_question(
303          alice.user_id,
304          "How does HashMap handle collisions in Java?",
305          "I know HashMap uses an array of buckets, but what happens when two keys "
306          "hash to the same bucket? Does it use chaining or open addressing?",
307          ["java", "hashmap", "data-structures"],
308      )
309      print(f"Posted: {q1}")
310  
311      # Bob posts another question
312      q2 = platform.post_question(
313          bob.user_id,
314          "When to use HashMap vs TreeMap?",
315          "Both implement the Map interface. HashMap gives O(1) lookups, TreeMap gives "
316          "sorted keys. When should I pick one over the other?",
317          ["java", "hashmap", "collections"],
318      )
319      print(f"Posted: {q2}")
320  
321      # Charlie answers Alice's question
322      a1 = platform.post_answer(
323          charlie.user_id,
324          q1.question_id,
325          "Java HashMap uses chaining with linked lists. Since Java 8, when a bucket "
326          "exceeds 8 entries, the linked list converts to a red-black tree for O(log n) lookups.",
327      )
328      print(f"\nCharlie answered: {a1}")
329  
330      # Bob also answers
331      a2 = platform.post_answer(
332          bob.user_id,
333          q1.question_id,
334          "Chaining. Each bucket is a linked list (or tree after threshold). "
335          "Open addressing is used by some other implementations like Python dicts.",
336      )
337      print(f"Bob answered: {a2}")
338  
339      # Voting
340      print("\n--- Voting ---")
341      platform.vote(alice.user_id, a1, True)
342      print(f"Alice upvoted Charlie's answer. Charlie rep: {charlie.reputation}")
343  
344      platform.vote(bob.user_id, a1, True)
345      print(f"Bob upvoted Charlie's answer. Charlie rep: {charlie.reputation}")
346  
347      platform.vote(alice.user_id, q2, True)
348      print(f"Alice upvoted Bob's question. Bob rep: {bob.reputation}")
349  
350      # Accept answer
351      print("\n--- Accept answer ---")
352      platform.accept_answer(alice.user_id, a1, q1)
353      print(f"Alice accepted Charlie's answer. Charlie rep: {charlie.reputation}")
354      print(f"Alice rep after accepting: {alice.reputation}")
355  
356      # Search with different strategies
357      print("\n--- Search by relevance ---")
358      platform.set_search_strategy(RelevanceStrategy())
359      results = platform.search("HashMap")
360      for r in results:
361          print(f"  {r}")
362  
363      print("\n--- Search by votes ---")
364      platform.set_search_strategy(VoteStrategy())
365      results = platform.search("HashMap")
366      for r in results:
367          print(f"  {r}")
368  
369      # Try voting without enough reputation
370      print("\n--- Permission check ---")
371      try:
372          platform.vote(charlie.user_id, q2, True)
373          print(f"Charlie voted (rep: {charlie.reputation})")
374      except PermissionError as e:
375          print(f"Expected error: {e}")
376  
377      # Close question (need enough rep)
378      alice.add_reputation(400)
379      platform.close_question(alice.user_id, q1.question_id)
380      print(f"\nAlice closed q1. State: {q1.state.state_name()}")
381  
382      # Try adding answer to closed question
383      try:
384          platform.post_answer(bob.user_id, q1.question_id, "Too late!")
385      except PermissionError as e:
386          print(f"Expected error: {e}")
387  
388      # Final state
389      print(f"\nFinal reputations:")
390      print(f"  Alice: {alice.reputation}")
391      print(f"  Bob: {bob.reputation}")
392      print(f"  Charlie: {charlie.reputation}")
393  
394      assert q1.state.state_name() == "CLOSED"
395      assert a1.accepted is True
396      assert charlie.reputation > 1
397      assert len(q1.answers) == 2
398      print("\nAll assertions passed.")

Common Mistakes

✗Letting anyone vote without reputation thresholds. Sock puppet accounts and vote manipulation become trivial.
✗Computing reputation on the fly from vote history every time. This is expensive. Store it as a running total and update incrementally.
✗Hardcoding search ranking logic. Relevance, votes, and date sorting should be swappable strategies, not if-else chains.
✗No separation between vote action and reputation effect. Changing the reputation formula means hunting through every place votes are cast.

Key Points

✓Reputation is the central mechanic. It gates who can vote, comment, edit, and close questions. Without thresholds, the system has no quality control.
✓Observer pattern propagates reputation changes. When a vote happens, both the voter and the content author get reputation adjustments.
✓Strategy pattern for search ranking. Users switch between relevance, votes, and date sorting without changing the search infrastructure.
✓State pattern governs question lifecycle. Open questions accept answers. Closed questions reject them. The state object enforces this cleanly.

Related Designs