Document Builder
Composite document tree with Builder construction, Visitor-based rendering to multiple formats, and Prototype for cloning templates. A textbook example of creational, structural, and behavioral patterns working together.
Key Abstractions
DocumentElement: Composite node interface. Every element in the document tree implements accept(visitor) for double-dispatch rendering.
Section: Composite node that contains child elements. Sections can nest other sections, paragraphs, tables, and images.
Paragraph, Table, Image: Leaf elements holding content. They accept visitors but have no children of their own.
DocumentBuilder: Builds a document step by step with addSection, addParagraph, addTable. Separates construction from representation.
DocumentVisitor: Interface with visitSection, visitParagraph, visitTable, visitImage. Adding a new output format means one new visitor class.
HtmlRenderer, PlainTextRenderer: Concrete visitors that traverse the document tree and produce HTML or plain text output respectively.
DocumentTemplate: Prototype that holds a pre-configured document structure. Clone it with deep copy to create customizable instances.
How It Works
A document is a tree. Sections contain other sections, paragraphs, tables, and images. That maps directly to the Composite pattern: every element implements a common interface, and sections hold a list of children that are themselves elements. You can render a section the same way you render a paragraph: call accept(visitor) and let polymorphism do the rest.
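The uniform treatment described above can be sketched in a few lines. This is a minimal illustration, not the design's actual classes: Node, Leaf, Branch, and outline are hypothetical names chosen for the sketch, and a direct recursive method stands in for the visitor to keep the example short.

```python
from abc import ABC, abstractmethod
from typing import List


class Node(ABC):
    """Common interface shared by containers and leaves (hypothetical name)."""

    @abstractmethod
    def outline(self, depth: int = 0) -> List[str]: ...


class Leaf(Node):
    """Holds content and has no children."""

    def __init__(self, label: str):
        self.label = label

    def outline(self, depth: int = 0) -> List[str]:
        return ["  " * depth + self.label]


class Branch(Node):
    """Holds children that are themselves Nodes."""

    def __init__(self, label: str, children: List[Node]):
        self.label = label
        self.children = children

    def outline(self, depth: int = 0) -> List[str]:
        lines = ["  " * depth + self.label]
        for child in self.children:
            # Same call for leaves and nested branches: no type checks.
            lines.extend(child.outline(depth + 1))
        return lines


tree = Branch("Report", [
    Leaf("intro paragraph"),
    Branch("Details", [Leaf("metrics table")]),
])
lines = tree.outline()
```

The caller never asks whether a child is a container or a leaf. That is the property the rest of the design relies on.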
The Builder pattern sits on top of this tree. Instead of manually constructing sections, adding children, and wiring parent-child relationships, you call addSection("Title"), add content, then endSection(). The builder maintains a stack of open sections, so nested structures build naturally. The caller never touches tree internals.
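The section stack is the part that is easiest to get wrong, so here is a stripped-down sketch of just that mechanism. MiniBuilder is a hypothetical name, and plain dicts stand in for real element classes; the point is only to show how push and pop decide where new content lands.

```python
from typing import Any, Dict, List


class MiniBuilder:
    """Hypothetical, stripped-down builder: nodes are plain dicts."""

    def __init__(self) -> None:
        self._root: List[Dict[str, Any]] = []
        self._stack: List[Dict[str, Any]] = []  # currently open sections

    def _target(self) -> List[Dict[str, Any]]:
        # New content goes into the innermost open section, else the root.
        return self._stack[-1]["children"] if self._stack else self._root

    def add_section(self, title: str) -> "MiniBuilder":
        section = {"section": title, "children": []}
        self._target().append(section)
        self._stack.append(section)  # push: later adds nest inside it
        return self

    def add_paragraph(self, text: str) -> "MiniBuilder":
        self._target().append({"paragraph": text})
        return self

    def end_section(self) -> "MiniBuilder":
        if not self._stack:
            raise RuntimeError("No section to end")
        self._stack.pop()  # pop: later adds go to the parent again
        return self

    def build(self) -> List[Dict[str, Any]]:
        if self._stack:
            raise RuntimeError("Unclosed sections remain")
        return self._root


doc = (
    MiniBuilder()
    .add_section("Outer")
    .add_paragraph("in outer")
    .add_section("Inner")
    .add_paragraph("in inner")
    .end_section()
    .add_paragraph("back in outer")
    .end_section()
    .build()
)
```

After the inner end_section(), the next paragraph attaches to "Outer" again without the caller naming a parent.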
Rendering is where the Visitor pattern earns its keep. Each element knows nothing about HTML or plain text. It only knows how to call visitor.visitParagraph(this) or visitor.visitSection(this). The visitor accumulates output as it traverses the tree. Adding a PDF renderer means writing one new class that implements DocumentVisitor. Zero changes to any element class.
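To make the "one new class" claim concrete, the sketch below adds two operations to a trimmed-down tree without touching the element classes. Everything here is illustrative: Visitor stands in for DocumentVisitor with only two element types, and MarkdownRenderer and WordCounter are hypothetical visitors that are not part of the design above.

```python
from abc import ABC, abstractmethod
from typing import List, Union


class Visitor(ABC):
    """Stand-in for DocumentVisitor, trimmed to two element types."""

    @abstractmethod
    def visit_section(self, section: "Section") -> None: ...

    @abstractmethod
    def visit_paragraph(self, paragraph: "Paragraph") -> None: ...


class Paragraph:
    def __init__(self, text: str):
        self.text = text

    def accept(self, visitor: Visitor) -> None:
        visitor.visit_paragraph(self)  # the element picks the method


class Section:
    def __init__(self, title: str, children: List[Union["Section", Paragraph]]):
        self.title = title
        self.children = children

    def accept(self, visitor: Visitor) -> None:
        visitor.visit_section(self)


class MarkdownRenderer(Visitor):
    """A new output format: one class, no element changes."""

    def __init__(self) -> None:
        self._lines: List[str] = []
        self._depth = 0

    def visit_section(self, section: Section) -> None:
        level = min(self._depth + 1, 6)
        self._lines.append("#" * level + " " + section.title)
        self._depth += 1
        for child in section.children:  # the visitor drives traversal
            child.accept(self)
        self._depth -= 1

    def visit_paragraph(self, paragraph: Paragraph) -> None:
        self._lines.append(paragraph.text)

    def result(self) -> str:
        return "\n".join(self._lines)


class WordCounter(Visitor):
    """A non-rendering operation added the same way."""

    def __init__(self) -> None:
        self.words = 0

    def visit_section(self, section: Section) -> None:
        for child in section.children:
            child.accept(self)

    def visit_paragraph(self, paragraph: Paragraph) -> None:
        self.words += len(paragraph.text.split())


tree = Section("Report", [
    Paragraph("Intro text."),
    Section("Details", [Paragraph("More detail here.")]),
])

md = MarkdownRenderer()
tree.accept(md)
counter = WordCounter()
tree.accept(counter)
```

In this sketch the visitor, not the element, iterates over a section's children. That choice lets a renderer track nesting depth and emit closing markup after the children have been visited.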
The Prototype pattern handles templates. A "Weekly Status Report" template is a pre-built document tree. When someone needs a new weekly report, they clone the template (deep copy) and customize the placeholders. No need to rebuild the same structure repeatedly, and the original template stays intact.
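Why the clone has to be deep is easiest to see by trying a shallow copy first. The sketch below uses a nested dict as a hypothetical stand-in for a template tree and relies only on the standard library's copy module.

```python
import copy

PLACEHOLDER = "[Replace with status summary]"

# Hypothetical stand-in for a template tree: a dict with nested children.
template = {"title": "Weekly Status", "children": [{"paragraph": PLACEHOLDER}]}

# Shallow copy: the outer dict is new, but the children list is shared.
shallow = copy.copy(template)
shallow["children"][0]["paragraph"] = "Edited through the shallow copy"
template_after_shallow = template["children"][0]["paragraph"]  # corrupted

# Restore the template, then clone it properly.
template["children"][0]["paragraph"] = PLACEHOLDER
deep = copy.deepcopy(template)
deep["children"][0]["paragraph"] = "All migration tasks completed."
template_after_deep = template["children"][0]["paragraph"]  # untouched
```

With copy.copy the edit leaks into the template because both objects point at the same children list. copy.deepcopy duplicates every nested object, so the template keeps its placeholder.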
Requirements
Functional
- Build documents containing sections, paragraphs, tables, and images
- Sections can nest arbitrarily deep (sections within sections)
- Render any document to HTML and plain text output
- Clone document templates to create pre-configured starting points
- Builder provides a fluent API for step-by-step construction
Non-Functional
- Adding a new output format requires only one new visitor class, no changes to elements
- Adding a new element type requires one new visit method on the visitor interface, implemented in every concrete visitor
- Deep clone must produce fully independent copies with no shared mutable state
- Tree traversal should be O(n) where n is the total number of elements
Design Decisions
Couldn't each element just have a render() method?
If each element had a renderHtml() and renderPlainText() method, adding a third format (PDF, Markdown, LaTeX) would mean editing every element class. With five element types and four formats, that's twenty methods scattered across five files. Visitor inverts this: each format is one class with all its rendering logic in one place. The trade-off is that adding a new element type requires touching every visitor, but in practice, document element types stabilize quickly while output formats keep growing.
What does Builder give you over a plain constructor?
A document can have zero sections or twenty. Some have images, some don't. Tables are optional. A constructor that takes all of these as parameters would be monstrous. Builder lets the caller add exactly what they need in whatever order makes sense. The section stack handles nesting naturally: addSection pushes, endSection pops. The fluent API reads almost like the document itself.
Does a factory work better than Prototype for templates?
A factory would need to know how to construct every template type. As templates multiply (weekly report, quarterly review, incident postmortem), the factory grows linearly. Prototype is simpler: store a fully built document tree as the template, deep-clone it when needed. The caller customizes the clone. No factory class to maintain, no switch statement to extend.
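One way to hold many templates without a growing switch statement is a small registry of prototypes. This is a sketch under assumptions: TemplateRegistry, register, and create are hypothetical names, and dicts stand in for document trees.

```python
import copy
from typing import Any, Dict


class TemplateRegistry:
    """Hypothetical registry mapping a template name to a prototype tree."""

    def __init__(self) -> None:
        self._prototypes: Dict[str, Any] = {}

    def register(self, name: str, prototype: Any) -> None:
        self._prototypes[name] = prototype

    def create(self, name: str) -> Any:
        # Each caller receives an independent deep copy of the prototype.
        return copy.deepcopy(self._prototypes[name])


registry = TemplateRegistry()
registry.register("weekly", {"title": "Weekly Status", "children": ["[summary]"]})
registry.register(
    "postmortem",
    {"title": "Incident Postmortem", "children": ["[timeline]", "[root cause]"]},
)

report = registry.create("weekly")
report["children"][0] = "All migration tasks completed."
fresh = registry.create("weekly")  # still carries the placeholder
```

Adding a new template type is a register call with a pre-built tree. The registry itself never needs to learn how any particular template is constructed.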
What's wrong with a flat list instead of Composite?
Documents are inherently hierarchical. A section contains subsections which contain paragraphs. A flat list would require markers or indices to represent nesting, and every rendering operation would need to track nesting depth manually. Composite makes the tree structure explicit. Recursive operations like rendering and word counting fall out naturally from the tree traversal.
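For contrast, here is roughly what a subtree operation looks like on a flat list. The encoding is a hypothetical one, (depth, kind, text) tuples, chosen only to illustrate the bookkeeping; it is not taken from the design above.

```python
from typing import List, Tuple

# Hypothetical flat encoding: (depth, kind, text). Depth is the nesting marker.
FlatItem = Tuple[int, str, str]

flat_doc: List[FlatItem] = [
    (0, "section", "Report"),
    (1, "paragraph", "Revenue grew this quarter."),
    (1, "section", "Details"),
    (2, "paragraph", "Latency dropped."),
    (1, "paragraph", "See appendix."),
]


def section_word_count(items: List[FlatItem], start: int) -> int:
    """Count paragraph words belonging to the section at index `start`."""
    section_depth = items[start][0]
    total = 0
    for depth, kind, text in items[start + 1:]:
        # The subtree ends at the first item that is not nested deeper.
        if depth <= section_depth:
            break
        if kind == "paragraph":
            total += len(text.split())
    return total


report_words = section_word_count(flat_doc, 0)   # whole document
details_words = section_word_count(flat_doc, 2)  # only the nested section
```

Every operation that cares about structure has to repeat this depth comparison. With a Composite tree the same question is a recursive sum over a section's children, and the boundary of a subtree is simply where its child list ends.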
Interview Follow-ups
- "How would you add PDF export?" Write a PdfRendererVisitor that implements DocumentVisitor. Use a library like ReportLab (Python) or iText (Java) inside the visit methods. The document tree and builder stay untouched.
- "What about lazy rendering for large documents?" Instead of building the full output string eagerly, have the visitor write to a stream. Each visit method flushes its chunk immediately, keeping memory constant regardless of document size. The visitor interface stays the same; only the internal buffering strategy changes.
- "How would you handle streaming for very large documents?" Replace the visitor's StringBuilder with a Writer or OutputStream. Visit methods write directly to the stream and flush periodically. For extremely large tables, the Table element could hold a row iterator instead of a materialized list, yielding rows on demand during traversal.
- "What if you need undo/redo in a document editor?" Combine this with Memento and Command. Each builder operation becomes a Command object. The document tree state before each operation is captured as a Memento. The Text Editor LLD uses exactly this approach, and the two designs compose naturally.
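The streaming answers above can be sketched as follows. This is an illustration under assumptions, not part of the implementation below: StreamingTable, StreamingTextRenderer, and generate_rows are hypothetical names, and io.StringIO stands in for a file or socket.

```python
import io
from typing import Iterable, Iterator, List, TextIO


class StreamingTable:
    """Hypothetical table whose rows come from any iterable, e.g. a generator."""

    def __init__(self, headers: List[str], rows: Iterable[List[str]]):
        self.headers = headers
        self.rows = rows


class StreamingTextRenderer:
    """Writes each chunk to a stream instead of accumulating a list of parts."""

    def __init__(self, out: TextIO):
        self._out = out

    def visit_table(self, table: StreamingTable) -> None:
        self._out.write(" | ".join(table.headers) + "\n")
        for row in table.rows:  # rows are pulled one at a time
            self._out.write(" | ".join(row) + "\n")


def generate_rows(n: int) -> Iterator[List[str]]:
    """Yield rows on demand; nothing is materialized up front."""
    for i in range(n):
        yield [f"task-{i}", "done"]


buffer = io.StringIO()  # stands in for an open file or network stream
renderer = StreamingTextRenderer(buffer)
renderer.visit_table(StreamingTable(["Task", "Status"], generate_rows(3)))
output = buffer.getvalue()
```

Because the renderer only holds one row at a time, its own memory use does not grow with the table. The trade-off is that a generator-backed table can be traversed once, so rendering to a second format needs a fresh row source.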
Code Implementation
from __future__ import annotations
from abc import ABC, abstractmethod
from html import escape
from typing import List
import copy


# ── Composite: Element Interface ──────────────────────────────────

class DocumentElement(ABC):
    """Composite node interface. Every element accepts a visitor."""

    @abstractmethod
    def accept(self, visitor: DocumentVisitor) -> None: ...

    @abstractmethod
    def get_name(self) -> str: ...


# ── Composite: Leaf Elements ──────────────────────────────────────

class Paragraph(DocumentElement):
    def __init__(self, text: str):
        self.text = text

    def accept(self, visitor: DocumentVisitor) -> None:
        visitor.visit_paragraph(self)

    def get_name(self) -> str:
        return "Paragraph"


class Table(DocumentElement):
    def __init__(self, headers: List[str], rows: List[List[str]]):
        self.headers = headers
        self.rows = rows

    def accept(self, visitor: DocumentVisitor) -> None:
        visitor.visit_table(self)

    def get_name(self) -> str:
        return "Table"


class Image(DocumentElement):
    def __init__(self, src: str, alt_text: str):
        self.src = src
        self.alt_text = alt_text

    def accept(self, visitor: DocumentVisitor) -> None:
        visitor.visit_image(self)

    def get_name(self) -> str:
        return "Image"


# ── Composite: Section (Container) ───────────────────────────────

class Section(DocumentElement):
    def __init__(self, title: str):
        self.title = title
        self.children: List[DocumentElement] = []

    def add_child(self, element: DocumentElement) -> None:
        self.children.append(element)

    def accept(self, visitor: DocumentVisitor) -> None:
        # Only dispatch here. The visitors iterate over the children
        # themselves so they can track depth and emit closing markup;
        # iterating here as well would visit every child twice.
        visitor.visit_section(self)

    def get_name(self) -> str:
        return f"Section({self.title})"


# ── Visitor Interface ─────────────────────────────────────────────

class DocumentVisitor(ABC):
    @abstractmethod
    def visit_section(self, section: Section) -> None: ...

    @abstractmethod
    def visit_paragraph(self, paragraph: Paragraph) -> None: ...

    @abstractmethod
    def visit_table(self, table: Table) -> None: ...

    @abstractmethod
    def visit_image(self, image: Image) -> None: ...


# ── Concrete Visitors ─────────────────────────────────────────────

class HtmlRenderer(DocumentVisitor):
    def __init__(self):
        self._parts: List[str] = []
        self._depth = 0

    def visit_section(self, section: Section) -> None:
        tag = min(self._depth + 1, 6)  # HTML defines h1 through h6 only
        # escape() keeps user text from being interpreted as markup
        self._parts.append(f"<h{tag}>{escape(section.title)}</h{tag}>")
        self._parts.append("<div class=\"section\">")
        self._depth += 1

        for child in section.children:
            child.accept(self)

        self._depth -= 1
        self._parts.append("</div>")

    def visit_paragraph(self, paragraph: Paragraph) -> None:
        self._parts.append(f"<p>{escape(paragraph.text)}</p>")

    def visit_table(self, table: Table) -> None:
        self._parts.append("<table>")
        self._parts.append("<tr>" + "".join(f"<th>{escape(h)}</th>" for h in table.headers) + "</tr>")
        for row in table.rows:
            self._parts.append("<tr>" + "".join(f"<td>{escape(c)}</td>" for c in row) + "</tr>")
        self._parts.append("</table>")

    def visit_image(self, image: Image) -> None:
        self._parts.append(f"<img src=\"{escape(image.src)}\" alt=\"{escape(image.alt_text)}\" />")

    def get_result(self) -> str:
        return "\n".join(self._parts)


class PlainTextRenderer(DocumentVisitor):
    def __init__(self):
        self._parts: List[str] = []
        self._indent = 0

    def _prefix(self) -> str:
        return " " * self._indent

    def visit_section(self, section: Section) -> None:
        self._parts.append(f"{self._prefix()}{'=' * 40}")
        self._parts.append(f"{self._prefix()}{section.title.upper()}")
        self._parts.append(f"{self._prefix()}{'=' * 40}")
        self._indent += 1

        for child in section.children:
            child.accept(self)

        self._indent -= 1

    def visit_paragraph(self, paragraph: Paragraph) -> None:
        self._parts.append(f"{self._prefix()}{paragraph.text}")
        self._parts.append("")

    def visit_table(self, table: Table) -> None:
        # Each column is as wide as its longest cell or header.
        # default=0 keeps this working for a table with no rows.
        col_widths = [
            max(len(h), max((len(r[i]) for r in table.rows), default=0))
            for i, h in enumerate(table.headers)
        ]
        header_line = " | ".join(h.ljust(w) for h, w in zip(table.headers, col_widths))
        self._parts.append(f"{self._prefix()}{header_line}")
        self._parts.append(f"{self._prefix()}{'-+-'.join('-' * w for w in col_widths)}")
        for row in table.rows:
            row_line = " | ".join(c.ljust(w) for c, w in zip(row, col_widths))
            self._parts.append(f"{self._prefix()}{row_line}")
        self._parts.append("")

    def visit_image(self, image: Image) -> None:
        self._parts.append(f"{self._prefix()}[Image: {image.alt_text} ({image.src})]")
        self._parts.append("")

    def get_result(self) -> str:
        return "\n".join(self._parts)


# ── Builder ───────────────────────────────────────────────────────

class Document:
    """The product: a list of root-level document elements."""

    def __init__(self, elements: List[DocumentElement]):
        self.elements = elements

    def accept(self, visitor: DocumentVisitor) -> None:
        for element in self.elements:
            element.accept(visitor)


class DocumentBuilder:
    """Builds a document tree step-by-step. Sections can be nested."""

    def __init__(self):
        self._root_elements: List[DocumentElement] = []
        self._section_stack: List[Section] = []

    def _current_target(self) -> List[DocumentElement]:
        # Innermost open section if there is one, otherwise the root
        if self._section_stack:
            return self._section_stack[-1].children
        return self._root_elements

    def add_section(self, title: str) -> "DocumentBuilder":
        section = Section(title)
        self._current_target().append(section)
        self._section_stack.append(section)
        return self

    def end_section(self) -> "DocumentBuilder":
        if not self._section_stack:
            raise RuntimeError("No section to end")
        self._section_stack.pop()
        return self

    def add_paragraph(self, text: str) -> "DocumentBuilder":
        self._current_target().append(Paragraph(text))
        return self

    def add_table(self, headers: List[str], rows: List[List[str]]) -> "DocumentBuilder":
        self._current_target().append(Table(headers, rows))
        return self

    def add_image(self, src: str, alt_text: str) -> "DocumentBuilder":
        self._current_target().append(Image(src, alt_text))
        return self

    def build(self) -> Document:
        if self._section_stack:
            raise RuntimeError("Unclosed sections remain")
        return Document(self._root_elements)


# ── Prototype: DocumentTemplate ───────────────────────────────────

class DocumentTemplate:
    """Prototype. Clone to create pre-configured documents."""

    def __init__(self, name: str, elements: List[DocumentElement]):
        self.name = name
        self.elements = elements

    def clone(self) -> "DocumentTemplate":
        # Deep copy so the clone shares no mutable children with the template
        return DocumentTemplate(self.name, copy.deepcopy(self.elements))

    def to_document(self) -> Document:
        return Document(self.elements)


# ── Demo ──────────────────────────────────────────────────────────

if __name__ == "__main__":
    # 1. Build a report using DocumentBuilder
    print("=== Building Report with DocumentBuilder ===\n")
    doc = (
        DocumentBuilder()
        .add_section("Quarterly Report")
        .add_paragraph("This report covers Q4 performance metrics and key findings.")
        .add_table(
            ["Metric", "Q3", "Q4", "Change"],
            [
                ["Revenue", "$1.2M", "$1.5M", "+25%"],
                ["Users", "50K", "68K", "+36%"],
                ["Latency", "120ms", "95ms", "-21%"],
            ],
        )
        .add_section("Infrastructure")
        .add_paragraph("Migrated three services to Kubernetes. P99 latency dropped 21%.")
        .add_image("infra-diagram.png", "Infrastructure architecture diagram")
        .end_section()
        .add_section("Next Steps")
        .add_paragraph("Focus on automated canary deployments and cost optimization.")
        .end_section()
        .end_section()
        .build()
    )

    # 2. Render to HTML
    print("--- HTML Output ---")
    html_renderer = HtmlRenderer()
    doc.accept(html_renderer)
    print(html_renderer.get_result())

    # 3. Render to Plain Text
    print("\n--- Plain Text Output ---")
    text_renderer = PlainTextRenderer()
    doc.accept(text_renderer)
    print(text_renderer.get_result())

    # 4. Prototype: create a template and clone it
    print("\n=== Prototype: Cloning a Template ===\n")
    template_doc = (
        DocumentBuilder()
        .add_section("Weekly Status")
        .add_paragraph("[Replace with status summary]")
        .add_table(
            ["Task", "Owner", "Status"],
            [["Example task", "TBD", "In Progress"]],
        )
        .end_section()
        .build()
    )
    template = DocumentTemplate("Weekly Status Template", template_doc.elements)

    # Clone and customize
    cloned = template.clone()
    cloned.name = "Week 42 Status"
    section = cloned.elements[0]
    if isinstance(section, Section) and section.children:
        section.children[0] = Paragraph("All migration tasks completed. No blockers.")

    print(f"Original template name: {template.name}")
    print(f"Cloned document name: {cloned.name}")

    clone_renderer = PlainTextRenderer()
    cloned.to_document().accept(clone_renderer)
    print(clone_renderer.get_result())

    print("\nAll operations completed successfully.")

Common Mistakes
- ✗ Putting rendering logic inside element classes. Adding a new format like PDF means editing every single element class, which violates Open/Closed.
- ✗ Skipping Composite and handling sections vs. paragraphs vs. tables with separate code paths and type-checking scattered everywhere.
- ✗ Forgetting deep clone in Prototype. Shared mutable children mean modifying a cloned document silently corrupts the original template.
- ✗ Building documents with constructors instead of Builder. Optional sections become awkward and callers must know the full construction sequence.
Key Points
- ✓ Builder separates construction from representation: the same build process creates different document structures without telescoping constructors.
- ✓ Composite tree lets you treat sections and leaves uniformly with recursive rendering. No special-casing for containers vs. content.
- ✓ Visitor adds new operations (HTML, PDF, word count) without modifying element classes. Double-dispatch keeps elements closed for modification.
- ✓ Prototype for templates avoids reconstructing common layouts from scratch. Deep cloning gives each copy its own children, so edits to a clone never reach the original template.