File Storage (Dropbox)
Chunked uploads, content-addressed storage, versioning. The win is deduplication — two users uploading the same file only store the chunks once.
Key Abstractions
- File: user-facing entity — owner, path, versions, current head
- Version: a snapshot of the file — list of chunk hashes, timestamp, size
- Chunk: fixed-size blob (4MB typical), content-addressed by SHA-256 of its bytes
- BlobStore: low-level store keyed by chunk hash; typically S3 or equivalent
- MetadataStore: tree of folders, files, and versions — the source of truth for listings
- Uploader: orchestrates chunking, dedup, upload, download; also exposes delete/undelete/purge for the trash flow
The Key Insight
Dropbox scales because it doesn't store files — it stores chunks keyed by their content hash. Two users uploading the same PDF only store the PDF's chunks once. The same file saved twice adds zero bytes to cold storage. This single design choice is worth billions in infrastructure cost.
Separation of metadata and blob stores is the second lever. Metadata — who owns what, path trees, version history — is small, mutates often, and needs transactions. Blobs are large, immutable, and eventually consistent. Treating them as one database is a recipe for suffering. A Postgres-for-metadata / S3-for-blobs split is the industry default.
The third insight is immutable versions. A version is a list of chunk hashes, not a mutation of prior state. Restoring v5 after editing to v7 creates v8 that shares chunks with v5. No data is ever overwritten, which means corruption, ransomware, and "I deleted the wrong file" all become recoverable.
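To make the metadata/blob split concrete, here is a sketch of the BlobStore interface from the implementation below backed by S3 via boto3. The bucket name is a placeholder and the client wiring is an assumption; a production version would add retries, multipart puts for large chunks, and lifecycle policies.

import boto3
from botocore.exceptions import ClientError

class S3BlobStore:
    """Content-addressed blobs in S3, keyed by chunk hash. Illustrative sketch only."""

    def __init__(self, bucket: str = "chunk-store-demo"):  # placeholder bucket name
        self._s3 = boto3.client("s3")  # assumes credentials are configured
        self._bucket = bucket

    def put(self, hash_: str, data: bytes) -> None:
        # Idempotent: re-putting the same key writes the same bytes.
        self._s3.put_object(Bucket=self._bucket, Key=hash_, Body=data)

    def get(self, hash_: str) -> bytes:
        return self._s3.get_object(Bucket=self._bucket, Key=hash_)["Body"].read()

    def exists(self, hash_: str) -> bool:
        try:
            self._s3.head_object(Bucket=self._bucket, Key=hash_)
            return True
        except ClientError:
            return False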
Requirements
Functional
- Upload a file, chunk it, store only unique chunks
- Download the head version or any earlier version
- Restore to an earlier version (creates a new head pointing to old chunks)
- Resume a partial upload from the last acknowledged chunk
- Multiple users can store the same content with zero duplication
Non-Functional
- Storage cost scales with unique content, not total uploads
- Download parallelizable (fetch chunks concurrently; see the sketch after this list)
- Metadata updates are transactional; blob writes are eventually consistent
- Bit-exact recovery for any stored version
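Because chunks are independent and immutable, the download path parallelizes trivially. A minimal sketch, assuming any store with the get-by-hash method used throughout this design; executor.map preserves input order, so reassembly is just a join:

from concurrent.futures import ThreadPoolExecutor

def download_parallel(blob, chunk_hashes: list[str], workers: int = 8) -> bytes:
    # Fetch chunks concurrently; map() yields results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(blob.get, chunk_hashes))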
Design Decisions
Why content-addressed storage?
Deduplication is the economic moat. With path-keyed storage, the same bytes uploaded by 100k users cost 100k × filesize. Content-addressed, it costs filesize × 1. At petabyte scale this is the difference between a profitable product and bankruptcy.
Why 4MB chunks?
Small chunks give better dedup (an in-place single-byte edit rewrites only one chunk, though a single-byte insertion shifts every later chunk, which is the weakness rolling-hash chunking fixes). Large chunks amortize per-chunk overhead (HTTP round-trip, hash computation, index entry). 4MB is Dropbox's observed sweet spot; sync and backup tools such as rsync and restic use variable-length chunks via a rolling hash for even better dedup.
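For contrast, here is a toy content-defined chunker in the spirit of those rolling-hash schemes. A boundary is declared wherever a rolling sum over a small window matches a bit mask, so an insertion resynchronizes after a few bytes instead of shifting every later chunk. The window, mask, and size bounds are illustrative, not tuned values.

def cdc_split(data: bytes, window: int = 16, mask: int = 0x3F,
              min_size: int = 32, max_size: int = 1024) -> list[bytes]:
    """Toy content-defined chunking: cut where a rolling sum matches the mask."""
    chunks: list[bytes] = []
    start, rolling = 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i - start >= window:
            rolling -= data[i - window]  # slide the window forward
        size = i - start + 1
        if (size >= min_size and (rolling & mask) == mask) or size >= max_size:
            chunks.append(data[start:i + 1])  # boundary depends only on local content
            start, rolling = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # final partial chunk
    return chunks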
Why SHA-256 and not MD5?
MD5 collisions are trivial to generate. If two files share an MD5, one silently overwrites the other — catastrophic. SHA-256 collisions have never been demonstrated. The extra CPU cost is negligible next to the network I/O.
Why immutable versions?
Mutating a file in place means the old bytes are gone the instant the write completes. Users delete the wrong thing, malware encrypts everything, a bug corrupts data — and without versioning, there's no "restore to yesterday." Immutability makes recovery a metadata operation.
Why resumable uploads?
A 10GB file over a flaky mobile connection fails halfway every time. Chunked uploads with per-chunk ACK let the client resume from the last confirmed chunk. Dropbox's success on mobile hinges on this.
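The implementation below uploads a file in one call; a resumable flow wraps the same loop in a session that remembers which chunk indices have been acknowledged. A minimal sketch, where UploadSession and its acked set are illustrative names and blob is any object with the exists/put interface used in the code below:

import hashlib

class UploadSession:
    """Tracks per-chunk ACKs so a client can resume after a dropped connection."""

    def __init__(self, blob, chunk_size: int = 4 * 1024 * 1024):
        self._blob = blob
        self._chunk_size = chunk_size
        self.acked: set[int] = set()  # chunk indices the server has confirmed

    def upload(self, content: bytes) -> list[str]:
        """Upload every unacked chunk; calling again after a failure resumes, not restarts."""
        hashes = []
        for chunk_no, offset in enumerate(range(0, len(content), self._chunk_size)):
            data = content[offset:offset + self._chunk_size]
            hashes.append(hashlib.sha256(data).hexdigest())
            if chunk_no in self.acked:
                continue  # server already confirmed this chunk; skip it on resume
            if not self._blob.exists(hashes[-1]):
                self._blob.put(hashes[-1], data)
            self.acked.add(chunk_no)  # in a real client this is set by the server's ACK
        return hashes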
Interview Follow-ups
- "How would you share a file?" Add a
Shareentity:(fileId, sharedWith, permission). When Bob accesses a file shared by Alice, the system looks up the share, grants read-only access to the same chunks. No data copy. - "What about conflict resolution when two clients edit offline?" Three options: last-writer-wins (simple, loses data), conflict copy (Alice's file becomes
demo (conflict - Alice).txt), or CRDT merge (complex, only works for structured formats). - "How would real-time sync work?" A long-poll or WebSocket per client listening for metadata changes on watched paths. Server pushes deltas; client pulls only the new chunks.
- "How do you handle massive files (1TB+)?" Multipart upload with parallel chunk uploads. Metadata entry created upfront; chunks trickle in. Finalize builds the version record once all chunks land.
- "How does end-to-end encryption fit?" Client-side AES-GCM encryption before hashing. Dedup breaks (ciphertext differs per encryption) unless convergent encryption is used — key derived from content hash.
Code Implementation
from __future__ import annotations
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from datetime import datetime
from threading import RLock
import hashlib
import uuid


CHUNK_SIZE = 4 * 1024 * 1024  # 4 MB — matches Dropbox's real-world size


@dataclass(frozen=True)
class Chunk:
    hash: str
    size: int
    data: bytes


def chunk_hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class BlobStore(ABC):
    """Low-level content-addressed storage."""

    @abstractmethod
    def put(self, hash_: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, hash_: str) -> bytes: ...

    @abstractmethod
    def exists(self, hash_: str) -> bool: ...


class InMemoryBlobStore(BlobStore):
    def __init__(self):
        self._blobs: dict[str, bytes] = {}
        self._lock = RLock()
        self.put_calls = 0  # for dedup demonstration

    def put(self, hash_: str, data: bytes) -> None:
        with self._lock:
            self.put_calls += 1
            if hash_ not in self._blobs:
                self._blobs[hash_] = data

    def get(self, hash_: str) -> bytes:
        with self._lock:
            if hash_ not in self._blobs:
                raise KeyError(f"chunk {hash_[:8]} missing")
            return self._blobs[hash_]

    def exists(self, hash_: str) -> bool:
        with self._lock:
            return hash_ in self._blobs

    def unique_blob_count(self) -> int:
        with self._lock:
            return len(self._blobs)


@dataclass
class Version:
    id: str
    size: int
    created_at: datetime
    chunk_hashes: list[str]
    author_id: str


@dataclass
class File:
    id: str
    owner: str
    path: str
    versions: list[Version] = field(default_factory=list)
    head_version_id: str = ""
    deleted: bool = False  # soft-delete (trash)
    deleted_at: datetime | None = None


class MetadataStore:
    def __init__(self):
        self._by_path: dict[str, File] = {}  # "owner:path" -> File
        self._by_id: dict[str, File] = {}
        self._lock = RLock()

    def get_or_create(self, owner: str, path: str) -> File:
        with self._lock:
            key = f"{owner}:{path}"
            existing = self._by_path.get(key)
            if existing is not None and not existing.deleted:
                return existing
            if existing is not None and existing.deleted:
                # Re-uploading to a soft-deleted path resurrects it with history intact.
                existing.deleted = False
                existing.deleted_at = None
                return existing
            f = File(id=str(uuid.uuid4())[:8], owner=owner, path=path)
            self._by_path[key] = f
            self._by_id[f.id] = f
            return f

    def add_version(self, file_id: str, version: Version) -> None:
        with self._lock:
            f = self._by_id[file_id]
            f.versions.append(version)
            f.head_version_id = version.id

    def get_file(self, file_id: str) -> File:
        with self._lock:
            return self._by_id[file_id]

    def list(self, owner: str, include_trash: bool = False) -> list[File]:
        with self._lock:
            return [
                f for k, f in self._by_path.items()
                if k.startswith(f"{owner}:") and (include_trash or not f.deleted)
            ]

    def trash(self, owner: str) -> list[File]:
        with self._lock:
            return [f for k, f in self._by_path.items() if k.startswith(f"{owner}:") and f.deleted]

    def mark_deleted(self, file_id: str) -> None:
        with self._lock:
            f = self._by_id[file_id]
            if f.deleted:
                return
            f.deleted = True
            f.deleted_at = datetime.utcnow()

    def restore(self, file_id: str) -> None:
        with self._lock:
            f = self._by_id[file_id]
            f.deleted = False
            f.deleted_at = None

    def purge(self, file_id: str) -> list[str]:
        """Permanently remove metadata. Returns chunk hashes that *might* be orphaned."""
        with self._lock:
            f = self._by_id.pop(file_id, None)
            if f is None:
                return []
            self._by_path.pop(f"{f.owner}:{f.path}", None)
            return [h for v in f.versions for h in v.chunk_hashes]


class Uploader:
    def __init__(self, blob: BlobStore, meta: MetadataStore, chunk_size: int = CHUNK_SIZE):
        self._blob = blob
        self._meta = meta
        self._chunk_size = chunk_size

    def upload(self, owner: str, path: str, content: bytes) -> Version:
        file = self._meta.get_or_create(owner, path)
        chunks = self._split(content)
        hashes = []
        for data in chunks:
            h = chunk_hash(data)
            hashes.append(h)
            # Skip the upload if this chunk already exists — dedup across users and versions.
            if not self._blob.exists(h):
                self._blob.put(h, data)
        version = Version(
            id=str(uuid.uuid4())[:8],
            size=len(content),
            created_at=datetime.utcnow(),
            chunk_hashes=hashes,
            author_id=owner,
        )
        self._meta.add_version(file.id, version)
        return version

    def download(self, file_id: str, version_id: str | None = None) -> bytes:
        file = self._meta.get_file(file_id)
        if file.deleted and version_id is None:
            raise LookupError("file is in trash — restore it first, or request a version explicitly")
        # No version requested means the head; otherwise look up the requested one.
        wanted = file.head_version_id if version_id is None else version_id
        target = next((v for v in file.versions if v.id == wanted), None)
        if target is None:
            raise LookupError("version not found")
        return b"".join(self._blob.get(h) for h in target.chunk_hashes)

    def delete_file(self, file_id: str) -> None:
        """Soft delete — file moves to trash but chunks stay put. Reversible."""
        self._meta.mark_deleted(file_id)

    def undelete_file(self, file_id: str) -> None:
        """Restore a soft-deleted file. Head version is preserved."""
        self._meta.restore(file_id)

    def purge_file(self, file_id: str) -> int:
        """Permanent delete. Returns number of chunks released for GC (best-effort)."""
        chunk_hashes = self._meta.purge(file_id)
        # In-memory demo doesn't ref-count chunks across files; return the list size.
        return len(chunk_hashes)

    def restore(self, file_id: str, version_id: str) -> Version:
        """Roll the head back to an earlier version by creating a new version that copies the old chunks."""
        file = self._meta.get_file(file_id)
        source = next((v for v in file.versions if v.id == version_id), None)
        if source is None:
            raise LookupError("version not found")
        restored = Version(
            id=str(uuid.uuid4())[:8],
            size=source.size,
            created_at=datetime.utcnow(),
            chunk_hashes=list(source.chunk_hashes),
            author_id=file.owner,
        )
        self._meta.add_version(file.id, restored)
        return restored

    def _split(self, content: bytes) -> list[bytes]:
        if not content:
            return [b""]  # an empty file still gets one (empty) chunk so it has a version
        return [content[i:i + self._chunk_size] for i in range(0, len(content), self._chunk_size)]


if __name__ == "__main__":
    blob = InMemoryBlobStore()
    meta = MetadataStore()
    uploader = Uploader(blob, meta, chunk_size=16)  # tiny chunks for demo

    # Upload a file.
    content_v1 = b"Hello, this is a demo of chunked upload with dedup."
    v1 = uploader.upload("alice", "demo.txt", content_v1)
    print(f"v1: {v1.id} — {len(v1.chunk_hashes)} chunks, {blob.unique_blob_count()} unique blobs")

    # Upload again, unchanged. Metadata creates a new version but no new blobs.
    v2 = uploader.upload("alice", "demo.txt", content_v1)
    print(f"v2 (same content): {blob.unique_blob_count()} unique blobs still — full dedup")

    # Edit the tail of the file. Only the changed chunks upload.
    content_v3 = b"Hello, this is a demo of chunked upload with dedup + SMALL CHANGE"
    v3 = uploader.upload("alice", "demo.txt", content_v3)
    print(f"v3 (edit): {blob.unique_blob_count()} unique blobs, put_calls={blob.put_calls}")

    # Another user uploads the same v1 content. Zero new blobs.
    before = blob.unique_blob_count()
    uploader.upload("bob", "shared.txt", content_v1)
    print(f"Bob's upload added {blob.unique_blob_count() - before} blobs (should be 0)")

    # Download the head and a specific version.
    file_id = meta.list("alice")[0].id
    head = uploader.download(file_id)
    rollback = uploader.download(file_id, version_id=v1.id)
    print(f"Head size: {len(head)}, v1 size: {len(rollback)}")
    assert head == content_v3
    assert rollback == content_v1

    # Restore v1 as the new head.
    uploader.restore(file_id, v1.id)
    print(f"After version restore, head == v1 content: {uploader.download(file_id) == content_v1}")

    # Trash → restore → purge.
    uploader.delete_file(file_id)
    assert meta.get_file(file_id).deleted
    print(f"After delete, visible files: {len(meta.list('alice'))}; trash: {len(meta.trash('alice'))}")
    try:
        uploader.download(file_id)
    except LookupError as e:
        print(f"Download on trashed file blocked: {e}")

    uploader.undelete_file(file_id)
    assert not meta.get_file(file_id).deleted
    print(f"After undelete, visible files: {len(meta.list('alice'))}")

    # Purge removes metadata permanently; a real system would ref-count chunks and GC orphans.
    purged = uploader.purge_file(file_id)
    print(f"Purged {purged} chunk reference(s); file reachable: "
          f"{file_id in [f.id for f in meta.list('alice', include_trash=True)]}")

Common Mistakes
- ✗ Storing full files on every upload. At scale, duplication destroys the cost model.
- ✗ Letting the metadata store hold the bytes. Postgres doesn't want to be a blob store.
- ✗ Locking the whole file on write. Conflicting writes should produce a conflict copy, not block.
- ✗ Weak hashing (MD5, CRC). Collisions are rare but catastrophic — two different files sharing a chunk corrupts both.
Key Points
- ✓ Content-addressed storage: hash the chunk, key the blob by hash. Same bytes uploaded twice = one blob.
- ✓ Chunking enables resumable uploads. A failed 100MB upload resumes from chunk N, not byte 0.
- ✓ Metadata and blob stores are separate. Metadata is small and transactional; blobs are huge and eventually consistent.
- ✓ Versions are immutable lists of chunk hashes. Restoring creates a new head version that reuses the old version's chunks.