Printer Spooler
Multi-printer queue with priorities and paper/toner constraints. A priority queue per printer plus a job router handles the routing; cancellation and retries handle the rest.
Key Abstractions
What to print — pages, copies, duplex, priority, submitting user
One physical device. Owns its queue, paper/toner state, and current job.
Priority queue of jobs for a printer
Strategy picking which printer a new job goes to — round-robin, least-loaded, nearest
Facade. Submit, cancel, status. Manages printer pool and router.
Class Diagram
The Key Insight
The classic trap is one global queue feeding all printers. It sounds elegant but it's a coordination disaster: every printer contends on the same queue, a stalled printer holds jobs it can't finish, and fairness gets fuzzy fast. The real-world design — one queue per printer plus a router that picks on submit — decouples the printers completely. If printer 1 jams, printer 2 keeps going.
Priority isn't a feature to bolt on; it's in the data model. An enum gives compile-time safety (no priority = 47 weirdness) and a natural comparator. FIFO within a priority level keeps things fair — two HIGH jobs submitted in order print in order.
The consumable model (paper, toner) matters more than it looks. A printer with 3 sheets of paper and a 50-page job should reject the submission, not start printing and fail at page 3. Surfacing is_available gives the router a clean signal to route elsewhere, and an ops dashboard a clean signal to dispatch a refill tech.
Requirements
Functional
- Submit jobs with document, copies, priority
- Route to one of several printers based on policy
- Cancel queued jobs (but not the currently printing one)
- Track paper and toner; reject new jobs on low supplies
- Support priority ordering and fair FIFO within a priority
Non-Functional
- Printers process in parallel (one queue each)
- Cancellation is O(log n) — flag-and-skip
- Router strategy swappable (least-loaded, round-robin, nearest)
- Status queryable by job ID
Design Decisions
Why per-printer queues?
One global queue creates a chokepoint. Per-printer queues let devices run independently — one printer's problem stays local. The router absorbs the "pick a printer" decision at submit time, which is the right moment to make it (we know current loads, user's office location, etc.).
Why Priority as an enum?
An integer priority tempts everyone to invent levels — 7, 9, 42 — and debugging why "priority 42 came before priority 40" wastes hours. Three enum values (HIGH, NORMAL, LOW) match how humans actually think about print urgency and stay readable in logs.
Why flag-and-skip for cancellation?
A priority queue doesn't support arbitrary-key removal efficiently. Marking a job as cancelled and dropping it on poll is O(1) to mark, O(1) amortized to skip. Cleaner than rebuilding the heap.
Why can't we cancel the currently-printing job?
The page is literally halfway through the rollers. "Cancelling" would mean ejecting a half-printed page — achievable but not worth the hardware complexity. Real spoolers model this: cancellation only affects jobs still in the queue.
Why separate JobRouter from SpoolService?
Router is a policy. Service is the machinery. A building with one printer per floor wants nearest-floor routing. A print shop wants least-loaded for throughput. Same SpoolService, different router — no forks or feature flags required.
Interview Follow-ups
- "How would you support printer groups (color printers only, A3-capable only)?" Add
capabilities: Set<Capability>to Printer andrequires: Set<Capability>to PrintJob. Router filters by capability match first. - "What about user quotas (max 500 pages/month)?" Observer on job completion updates a per-user counter. Submit consults the counter before routing; quota exhaustion returns a clear error.
- "How do you handle a jammed printer?" Printer state machine: READY → BUSY → JAMMED. Router skips JAMMED printers. Jam-clear event moves back to READY. Queued jobs stay in place; admins can rebalance.
- "How would secure printing work?" Job marked
requiresPin. Printer holds it in queue but doesn't auto-process. User enters PIN at the device; matched jobs release for that printer only. - "What if we add a print server cluster?" Each server node owns a subset of printers (sharding by printer ID). The SpoolService API sits behind a load balancer. Status lookups route via consistent hashing on job ID.
Code Implementation
from __future__ import annotations
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from threading import RLock
from typing import Callable
import heapq
import itertools
import uuid
class Priority(Enum):
HIGH = 1
NORMAL = 2
LOW = 3
class JobStatus(Enum):
QUEUED = "queued"
PRINTING = "printing"
DONE = "done"
FAILED = "failed"
CANCELLED = "cancelled"
@dataclass
class Document:
name: str
page_count: int
_seq = itertools.count()
@dataclass
class PrintJob:
id: str
user_id: str
document: Document
copies: int = 1
duplex: bool = False
priority: Priority = Priority.NORMAL
status: JobStatus = JobStatus.QUEUED
created_at: datetime = field(default_factory=datetime.utcnow)
seq: int = field(default_factory=lambda: next(_seq))
def total_pages(self) -> int:
return self.copies * self.document.page_count
def __lt__(self, other: "PrintJob") -> bool:
# Lower priority enum value wins (HIGH=1 before NORMAL=2). FIFO tie-break.
return (self.priority.value, self.seq) < (other.priority.value, other.seq)
class PrintQueue:
def __init__(self):
self._heap: list[PrintJob] = []
self._index: dict[str, PrintJob] = {}
self._lock = RLock()
def offer(self, job: PrintJob) -> None:
with self._lock:
heapq.heappush(self._heap, job)
self._index[job.id] = job
def poll(self) -> PrintJob | None:
with self._lock:
while self._heap:
candidate = heapq.heappop(self._heap)
# Drop cancelled jobs silently.
if candidate.status == JobStatus.CANCELLED:
self._index.pop(candidate.id, None)
continue
self._index.pop(candidate.id, None)
return candidate
return None
def size(self) -> int:
with self._lock:
return sum(1 for j in self._heap if j.status == JobStatus.QUEUED)
def cancel(self, job_id: str) -> bool:
with self._lock:
job = self._index.get(job_id)
if job is None or job.status != JobStatus.QUEUED:
return False
job.status = JobStatus.CANCELLED
return True
class Printer:
def __init__(self, printer_id: str, model: str, paper: int = 500, toner: int = 10_000):
self.id = printer_id
self.model = model
self._queue = PrintQueue()
self._paper = paper # sheets available
self._toner = toner # arbitrary units
self._current: PrintJob | None = None
self._lock = RLock()
@property
def queue_size(self) -> int:
return self._queue.size()
@property
def is_available(self) -> bool:
# Available means "accepting jobs" — low supplies block new submissions.
with self._lock:
return self._paper > 10 and self._toner > 100
def submit(self, job: PrintJob) -> None:
if not self.is_available:
raise RuntimeError(f"Printer {self.id} not accepting jobs — low supplies")
self._queue.offer(job)
def process_next(self) -> PrintJob | None:
job = self._queue.poll()
if job is None:
return None
with self._lock:
self._current = job
job.status = JobStatus.PRINTING
pages_needed = job.total_pages()
if self._paper < pages_needed:
job.status = JobStatus.FAILED
self._current = None
return job
# Simulate consumption.
self._paper -= pages_needed
self._toner -= pages_needed # 1 toner unit per page — rough
job.status = JobStatus.DONE
self._current = None
return job
def refill_paper(self, sheets: int) -> None:
with self._lock:
self._paper += sheets
def refill_toner(self, units: int) -> None:
with self._lock:
self._toner += units
def cancel_queued(self, job_id: str) -> bool:
return self._queue.cancel(job_id)
class JobRouter(ABC):
@abstractmethod
def pick(self, printers: list[Printer], job: PrintJob) -> Printer | None: ...
class LeastLoadedRouter(JobRouter):
def pick(self, printers: list[Printer], job: PrintJob) -> Printer | None:
eligible = [p for p in printers if p.is_available]
return min(eligible, key=lambda p: p.queue_size) if eligible else None
class RoundRobinRouter(JobRouter):
def __init__(self):
self._cursor = 0
self._lock = RLock()
def pick(self, printers: list[Printer], job: PrintJob) -> Printer | None:
eligible = [p for p in printers if p.is_available]
if not eligible:
return None
with self._lock:
idx = self._cursor % len(eligible)
self._cursor += 1
return eligible[idx]
class NotificationBus:
"""Observer bus. Events: submitted, started, completed, failed, cancelled."""
def __init__(self):
self._subs: list[Callable[[str, PrintJob], None]] = []
def subscribe(self, handler: Callable[[str, PrintJob], None]) -> None:
self._subs.append(handler)
def publish(self, event: str, job: PrintJob) -> None:
for h in self._subs:
h(event, job)
class SpoolService:
def __init__(self, printers: list[Printer], router: JobRouter | None = None):
if not printers:
raise ValueError("need at least one printer")
self._printers = printers
self._router = router or LeastLoadedRouter()
self._jobs: dict[str, PrintJob] = {}
self._job_to_printer: dict[str, str] = {}
self._bus = NotificationBus()
def subscribe(self, handler: Callable[[str, PrintJob], None]) -> None:
self._bus.subscribe(handler)
def submit(self, user_id: str, document: Document, copies: int = 1,
priority: Priority = Priority.NORMAL, duplex: bool = False) -> str:
if copies < 1:
raise ValueError("copies must be >= 1")
job = PrintJob(
id=str(uuid.uuid4())[:8],
user_id=user_id,
document=document,
copies=copies,
duplex=duplex,
priority=priority,
)
printer = self._router.pick(self._printers, job)
if printer is None:
raise RuntimeError("No printer available")
printer.submit(job)
self._jobs[job.id] = job
self._job_to_printer[job.id] = printer.id
self._bus.publish("submitted", job)
return job.id
def cancel(self, job_id: str) -> bool:
printer_id = self._job_to_printer.get(job_id)
if printer_id is None:
return False
printer = next(p for p in self._printers if p.id == printer_id)
cancelled = printer.cancel_queued(job_id)
if cancelled:
self._bus.publish("cancelled", self._jobs[job_id])
return cancelled
def status(self, job_id: str) -> JobStatus | None:
job = self._jobs.get(job_id)
return job.status if job else None
def run_once(self) -> int:
"""Each printer drains one job. Returns total processed."""
processed = 0
for printer in self._printers:
job = printer.process_next()
if job is None:
continue
processed += 1
# printer.process_next already set status to DONE or FAILED.
event = "completed" if job.status == JobStatus.DONE else "failed"
self._bus.publish(event, job)
return processed
if __name__ == "__main__":
p1 = Printer("HP-01", "LaserJet", paper=100, toner=500)
p2 = Printer("HP-02", "LaserJet", paper=100, toner=500)
svc = SpoolService([p1, p2], router=LeastLoadedRouter())
events: list[tuple[str, str]] = []
svc.subscribe(lambda ev, job: events.append((ev, job.id)))
ids = []
for i in range(5):
ids.append(svc.submit(f"user-{i}", Document(f"report-{i}.pdf", page_count=10)))
high = svc.submit("urgent-user", Document("critical.pdf", page_count=2), priority=Priority.HIGH)
ids.append(high)
print(f"P1 queue: {p1.queue_size}, P2 queue: {p2.queue_size}")
# Cancel one queued job.
svc.cancel(ids[2])
print(f"Cancelled {ids[2][:8]}, status = {svc.status(ids[2]).value}")
# Drain queues. High-priority job goes first on whichever printer holds it.
while svc.run_once() > 0:
pass
done = sum(1 for i in ids if svc.status(i) == JobStatus.DONE)
cancelled = sum(1 for i in ids if svc.status(i) == JobStatus.CANCELLED)
print(f"Completed: {done}, Cancelled: {cancelled}")
print(f"Paper left — P1: {p1._paper}, P2: {p2._paper}")
# Observer summary.
counts: dict[str, int] = {}
for ev, _ in events:
counts[ev] = counts.get(ev, 0) + 1
print(f"Events fired: {counts}")
assert counts.get("submitted", 0) == len(ids)
assert counts.get("completed", 0) == done
assert counts.get("cancelled", 0) == cancelledCommon Mistakes
- ✗Single global queue with all printers pulling. Heavy contention and unfair routing under load.
- ✗Cancellation that kills the currently-printing job. The sheet of paper is already halfway through — cancel only affects queued jobs.
- ✗Forgetting to decrement paper count per page. Printer runs empty and users see 'printing' with nothing coming out.
- ✗Priority as a float. Use an enum — HIGH, NORMAL, LOW. Comparable, predictable, debuggable.
Key Points
- ✓Priority queue per printer, not a global queue. Printers work in parallel.
- ✓Job state machine: QUEUED → PRINTING → DONE, or QUEUED → CANCELLED, or PRINTING → FAILED.
- ✓Paper/toner modeled as consumables. Low-supply state stops accepting jobs and surfaces an alert.
- ✓Router is a Strategy — round-robin for fairness, least-loaded for throughput, nearest for office floors.