FoundationalProblem 18 of 20

FoundationalAdvanced

Expression Evaluator

Interpreter PatternVisitor Pattern

Parse math expressions into an AST, then walk the tree with visitors for evaluation, pretty-printing, or optimization. Interpreter builds the structure, Visitor adds operations without modifying it.

Key Abstractions

Expression

Abstract AST node with accept(visitor) for double dispatch to the right visit method

NumberExpr / BinaryExpr / UnaryExpr

Concrete AST nodes representing literals, binary operations, and unary operations

ExpressionVisitor

Visitor interface with a visit method per node type, enabling new operations without modifying nodes

Parser

Recursive descent parser that tokenizes input and builds an AST respecting operator precedence

Class Diagram

The Key Insight

An expression like 3 + 5 * 2 is not a flat string. It is a tree. The multiplication sits deeper because it binds tighter than addition. Once you see that, two patterns fall into place naturally.

The Interpreter pattern gives you the tree structure. Each node in the AST is a class: NumberExpr for literals, BinaryExpr for operations with two operands, UnaryExpr for prefix operators. A recursive descent parser reads the input string and builds this tree, with each precedence level handled by its own method. parse_expression() handles + and -. parse_term() handles * and /. parse_factor() handles numbers, unary minus, and parenthesized groups. The grammar writes itself into the code.

The Visitor pattern is what makes the tree useful beyond one-shot evaluation. You could put an evaluate() method on every node. That works. But the moment you also want print(), you are editing three classes. Want simplify()? Three more edits. Visitor flips this around. Each operation is its own class with a visit method per node type. New operations never touch the Expression hierarchy. The node classes stay frozen.

Double dispatch ties the two patterns together. When you call expression.accept(visitor), the expression knows its own type and calls visitor.visit_number(), visitor.visit_binary(), or visitor.visit_unary(). No isinstance checks. No type switches. The language's own dispatch mechanism routes to the right method.

Requirements

Functional

Parse arithmetic expressions with +, -, *, / operators and parentheses
Respect standard operator precedence: multiplication and division before addition and subtraction
Support unary minus for negation (e.g., -(3 + 2))
Evaluate parsed expressions to a numeric result
Pretty-print parsed expressions with explicit parentheses showing structure
Handle malformed input with clear error messages

Non-Functional

Adding a new operation (like simplification or derivative computation) should not require changes to any Expression class
The parser should be extensible to new operators by adding precedence levels
Division by zero must produce a clear error, not a silent Infinity or NaN

Design Decisions

Why Visitor over adding methods to Expression nodes?

If you put evaluate() on NumberExpr, BinaryExpr, and UnaryExpr, it works. But then you need pretty-print. Edit all three classes. Then you want a simplification pass. Edit all three again. Every new operation touches every node class.

With Visitor, each new operation is one new class. EvaluateVisitor computes results. PrintVisitor produces strings. A future SimplifyVisitor could fold constant sub-trees. The Expression hierarchy stays closed for modification but open for extension through visitors. That is the open-closed principle doing real work, not just sounding good in a textbook.

Why Interpreter pattern for the AST instead of direct evaluation?

You could call eval() or parse left-to-right and compute as you go. Simple and fast. But you get a number and lose everything else. No structure to inspect. No way to print what was parsed. No way to transform the expression before evaluating.

An AST is data you can walk multiple times. Evaluate it, print it, serialize it, optimize it. The tree is the intermediate representation that makes all downstream operations possible. Without it, you are stuck with a single pass that produces a single answer.

Why recursive descent parsing?

It maps one-to-one to a grammar with precedence levels. parse_expression() handles the lowest precedence (+, -). parse_term() handles higher precedence (*, /). parse_factor() handles the atomic pieces: numbers, unary operators, and parenthesized sub-expressions.

Each level calls the next level down, and the recursion naturally produces the correct tree shape. Adding exponentiation means adding parse_power() between parse_term() and parse_factor(). The structure tells you exactly where new operators slot in.

Why double dispatch in the Visitor?

Single dispatch means calling visitor.visit(expression). Inside that method, you need to check whether the expression is a NumberExpr, BinaryExpr, or UnaryExpr with instanceof. That is a type switch hiding in your visitor, and it breaks every time you add a new node type.

Double dispatch eliminates this. expression.accept(visitor) is the first dispatch (to the right Expression subclass). Inside accept, calling visitor.visit_number(self) is the second dispatch (to the right visitor method). The type system handles both lookups. No conditional logic. No casting. If you add a new node type and forget to handle it in a visitor, the compiler tells you.

Interview Follow-ups

"How would you add support for variables like x + 2?" Introduce a VariableExpr node that holds a variable name. The EvaluateVisitor takes an environment map (variable name to value) and looks up the variable during evaluation. The parser treats identifiers the same way it treats numbers in parse_factor().
"How would you implement operator precedence for a new operator like exponentiation?" Add a parse_power() method between parse_term() and parse_factor(). Exponentiation is right-associative, so parse_power() calls parse_factor() for the left side and recursively calls itself for the right side. Each precedence level is a method, so the slot for new operators is always clear.
"What if you want to simplify expressions like 0 + x or 1 * x?" Write a SimplifyVisitor that returns a new (possibly smaller) AST. When it visits a BinaryExpr with + and one side is NumberExpr(0), it returns the other side. No changes to any existing node or visitor class. Just one new visitor.
"How would you handle syntax errors with useful messages?" Track token positions during tokenization (line number, column offset). When the parser hits an unexpected token, include the position in the error message. Something like "Expected ')' at column 12, found '+'" is far more useful than "parse error".

Code Implementation

  1  from __future__ import annotations
  2  from abc import ABC, abstractmethod
  3  from dataclasses import dataclass
  4  
  5  
  6  # ── AST Nodes (Interpreter Pattern) ──────────────────────────────
  7  
  8  class Expression(ABC):
  9      @abstractmethod
 10      def accept(self, visitor: "ExpressionVisitor"):
 11          ...
 12  
 13  
 14  @dataclass
 15  class NumberExpr(Expression):
 16      value: float
 17  
 18      def accept(self, visitor: "ExpressionVisitor"):
 19          return visitor.visit_number(self)
 20  
 21  
 22  @dataclass
 23  class BinaryExpr(Expression):
 24      left: Expression
 25      op: str
 26      right: Expression
 27  
 28      def accept(self, visitor: "ExpressionVisitor"):
 29          return visitor.visit_binary(self)
 30  
 31  
 32  @dataclass
 33  class UnaryExpr(Expression):
 34      op: str
 35      operand: Expression
 36  
 37      def accept(self, visitor: "ExpressionVisitor"):
 38          return visitor.visit_unary(self)
 39  
 40  
 41  # ── Visitor Interface + Concrete Visitors ────────────────────────
 42  
 43  class ExpressionVisitor(ABC):
 44      @abstractmethod
 45      def visit_number(self, expr: NumberExpr):
 46          ...
 47  
 48      @abstractmethod
 49      def visit_binary(self, expr: BinaryExpr):
 50          ...
 51  
 52      @abstractmethod
 53      def visit_unary(self, expr: UnaryExpr):
 54          ...
 55  
 56  
 57  class EvaluateVisitor(ExpressionVisitor):
 58      def visit_number(self, expr: NumberExpr) -> float:
 59          return expr.value
 60  
 61      def visit_binary(self, expr: BinaryExpr) -> float:
 62          left = expr.left.accept(self)
 63          right = expr.right.accept(self)
 64          if expr.op == "+":
 65              return left + right
 66          if expr.op == "-":
 67              return left - right
 68          if expr.op == "*":
 69              return left * right
 70          if expr.op == "/":
 71              if right == 0:
 72                  raise ZeroDivisionError(
 73                      f"Division by zero: {left} / {right}"
 74                  )
 75              return left / right
 76          raise ValueError(f"Unknown operator: {expr.op}")
 77  
 78      def visit_unary(self, expr: UnaryExpr) -> float:
 79          operand = expr.operand.accept(self)
 80          if expr.op == "-":
 81              return -operand
 82          raise ValueError(f"Unknown unary operator: {expr.op}")
 83  
 84  
 85  class PrintVisitor(ExpressionVisitor):
 86      def visit_number(self, expr: NumberExpr) -> str:
 87          return str(expr.value)
 88  
 89      def visit_binary(self, expr: BinaryExpr) -> str:
 90          left = expr.left.accept(self)
 91          right = expr.right.accept(self)
 92          return f"({left} {expr.op} {right})"
 93  
 94      def visit_unary(self, expr: UnaryExpr) -> str:
 95          operand = expr.operand.accept(self)
 96          return f"({expr.op}{operand})"
 97  
 98  
 99  # ── Tokenizer + Parser ──────────────────────────────────────────
100  
101  @dataclass
102  class Token:
103      type: str   # NUMBER, OP, LPAREN, RPAREN
104      value: str
105  
106  
107  class Parser:
108      def __init__(self, text: str):
109          self.tokens = self._tokenize(text)
110          self.pos = 0
111  
112      def _tokenize(self, text: str) -> list[Token]:
113          tokens = []
114          i = 0
115          while i < len(text):
116              ch = text[i]
117              if ch.isspace():
118                  i += 1
119                  continue
120              if ch in "+-*/":
121                  tokens.append(Token("OP", ch))
122                  i += 1
123              elif ch == "(":
124                  tokens.append(Token("LPAREN", ch))
125                  i += 1
126              elif ch == ")":
127                  tokens.append(Token("RPAREN", ch))
128                  i += 1
129              elif ch.isdigit() or ch == ".":
130                  start = i
131                  while i < len(text) and (text[i].isdigit() or text[i] == "."):
132                      i += 1
133                  tokens.append(Token("NUMBER", text[start:i]))
134              else:
135                  raise ValueError(f"Unexpected character: {ch}")
136          return tokens
137  
138      def _peek(self) -> Token | None:
139          if self.pos < len(self.tokens):
140              return self.tokens[self.pos]
141          return None
142  
143      def _consume(self) -> Token:
144          token = self.tokens[self.pos]
145          self.pos += 1
146          return token
147  
148      def parse(self) -> Expression:
149          expr = self._parse_expression()
150          if self.pos < len(self.tokens):
151              raise ValueError(
152                  f"Unexpected token after expression: {self.tokens[self.pos].value}"
153              )
154          return expr
155  
156      def _parse_expression(self) -> Expression:
157          """Handles + and - (lowest precedence)."""
158          left = self._parse_term()
159          while (tok := self._peek()) and tok.type == "OP" and tok.value in "+-":
160              op = self._consume().value
161              right = self._parse_term()
162              left = BinaryExpr(left, op, right)
163          return left
164  
165      def _parse_term(self) -> Expression:
166          """Handles * and / (higher precedence)."""
167          left = self._parse_factor()
168          while (tok := self._peek()) and tok.type == "OP" and tok.value in "*/":
169              op = self._consume().value
170              right = self._parse_factor()
171              left = BinaryExpr(left, op, right)
172          return left
173  
174      def _parse_factor(self) -> Expression:
175          """Handles numbers, unary minus, and parenthesized groups."""
176          tok = self._peek()
177          if tok is None:
178              raise ValueError("Unexpected end of expression")
179  
180          # Unary minus
181          if tok.type == "OP" and tok.value == "-":
182              self._consume()
183              operand = self._parse_factor()
184              return UnaryExpr("-", operand)
185  
186          # Parenthesized sub-expression
187          if tok.type == "LPAREN":
188              self._consume()
189              expr = self._parse_expression()
190              closing = self._peek()
191              if closing is None or closing.type != "RPAREN":
192                  raise ValueError("Missing closing parenthesis")
193              self._consume()
194              return expr
195  
196          # Number literal
197          if tok.type == "NUMBER":
198              self._consume()
199              return NumberExpr(float(tok.value))
200  
201          raise ValueError(f"Unexpected token: {tok.value}")
202  
203  
204  if __name__ == "__main__":
205      evaluator = EvaluateVisitor()
206      printer = PrintVisitor()
207  
208      # Simple addition
209      tree = Parser("3 + 5").parse()
210      print(f"Expression: 3 + 5")
211      print(f"  Evaluate: {tree.accept(evaluator)}")
212      print(f"  Print:    {tree.accept(printer)}")
213  
214      # Operator precedence: * binds tighter than +
215      tree = Parser("2 + 3 * 4").parse()
216      print(f"\nExpression: 2 + 3 * 4")
217      print(f"  Evaluate: {tree.accept(evaluator)}")
218      print(f"  Print:    {tree.accept(printer)}")
219  
220      # Parentheses override precedence
221      tree = Parser("(2 + 3) * 4").parse()
222      print(f"\nExpression: (2 + 3) * 4")
223      print(f"  Evaluate: {tree.accept(evaluator)}")
224      print(f"  Print:    {tree.accept(printer)}")
225  
226      # Unary minus with compound expression
227      tree = Parser("-(3 + 2) * 4 / 2").parse()
228      print(f"\nExpression: -(3 + 2) * 4 / 2")
229      print(f"  Evaluate: {tree.accept(evaluator)}")
230      print(f"  Print:    {tree.accept(printer)}")
231  
232      # Division by zero
233      print(f"\nExpression: 10 / (5 - 5)")
234      try:
235          tree = Parser("10 / (5 - 5)").parse()
236          tree.accept(evaluator)
237      except ZeroDivisionError as e:
238          print(f"  Caught: {e}")

Common Mistakes

✗Putting evaluate() directly on Expression nodes. It works until you need a second operation. Then you edit every node class again.
✗Ignoring operator precedence in the parser. 2 + 3 * 4 should be 14, not 20.
✗Not handling division by zero. The EvaluateVisitor should check the denominator before dividing.
✗Using instanceof checks instead of the Visitor pattern. That defeats the purpose and scatters operation logic.

Key Points

✓Interpreter defines the grammar as a class hierarchy. Each node type is a rule in the grammar.
✓Visitor adds new operations to the AST without modifying any Expression class. Open-closed principle.
✓Double dispatch: Expression.accept(visitor) calls visitor.visit_X(self), routing to the correct visit method.
✓Parser handles operator precedence through recursive descent: addition/subtraction at one level, multiplication/division at another.

Related Designs