GDPR & CCPA Data Architecture
The Engineering Problem
GDPR and CCPA are not just legal documents. They are data architecture constraints. When a user submits a deletion request, you have one month under GDPR (commonly treated as 30 days, and extendable by up to two further months for complex or numerous requests) or 45 days under CCPA (extendable once by another 45 days) to remove their personal data from every system that holds it. That sounds simple until you realize their email address is in your PostgreSQL database, your Elasticsearch cluster, your Redshift warehouse, your Segment event stream, your Mailchimp audience, your Stripe records, and three months of CloudWatch logs.
Designing for Deletion
The key architectural decision is indirection. Instead of scattering PII across every system, use a pattern where a single user_id reference points to a centralized identity store:
- Canonical PII store: One service owns the user's personal data (name, email, address). All other systems reference only the opaque user_id.
- Deletion propagation: When a deletion request arrives, the identity service publishes a UserDeleted event. Every downstream system subscribes and purges its references.
- Soft-delete with crypto-shredding: For data that is expensive to physically delete (backups, append-only logs), encrypt PII with a per-user key. Deletion means destroying the key, rendering the data unreadable.
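The propagation and crypto-shredding steps above can be sketched roughly as follows. This is a minimal in-process illustration, not a reference implementation: EventBus, SearchIndex, KeyStore, and toy_cipher are hypothetical names, the bus stands in for a real message broker, and the XOR keystream is a deliberately simplified placeholder for an authenticated cipher such as AES-GCM.

```python
import hashlib
import secrets
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UserDeleted:
    """Event published by the identity service when a user is erased."""
    user_id: str


class EventBus:
    """In-process stand-in for a real message broker (assumption)."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event):
        for handler in self._handlers[type(event)]:
            handler(event)


class SearchIndex:
    """Downstream system that stores only opaque user_id references."""

    def __init__(self):
        self.docs = {}  # doc_id -> user_id

    def on_user_deleted(self, event):
        self.docs = {d: u for d, u in self.docs.items() if u != event.user_id}


class KeyStore:
    """Holds one key per user; destroying the key shreds that user's data."""

    def __init__(self):
        self._keys = {}

    def create(self, user_id):
        self._keys[user_id] = secrets.token_bytes(32)
        return self._keys[user_id]

    def get(self, user_id):
        return self._keys.get(user_id)  # None once the key is destroyed

    def on_user_deleted(self, event):
        self._keys.pop(event.user_id, None)


def toy_cipher(key, data):
    """TOY XOR keystream for illustration only, NOT secure.

    Applying it twice with the same key returns the original bytes.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


bus, index, keys = EventBus(), SearchIndex(), KeyStore()
bus.subscribe(UserDeleted, index.on_user_deleted)
bus.subscribe(UserDeleted, keys.on_user_deleted)

index.docs = {"doc-1": "u-42", "doc-2": "u-7"}
# Ciphertext as it might sit in an immutable backup.
backup_blob = toy_cipher(keys.create("u-42"), b"alice@example.com")

bus.publish(UserDeleted("u-42"))
```

After the event is published, the search index no longer references u-42 and the key store has no key for that user, so backup_blob can stay in the backup without being readable. In a real deployment the handlers would also need retries and an acknowledgement trail so that a failed purge in one system is noticed before the deadline.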
Consent Architecture
Consent is not a boolean. It is a versioned, per-purpose state machine. A user might consent to marketing emails but not analytics tracking. They might withdraw consent for data sharing but keep their account active.
Build a consent ledger that records every consent grant and withdrawal with timestamps, the specific purpose, and the policy version the user agreed to. Expose this ledger via an internal API so every service can check canProcess(userId, purpose) before handling data. When consent is withdrawn, trigger the same event-driven cleanup as deletion, but scoped to the specific processing purpose.
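A minimal sketch of such a ledger might look like the following. The class and field names are illustrative assumptions, can_process is a Python-style spelling of the canProcess(userId, purpose) check, and the sketch assumes events are appended in chronological order and that the absence of any record means no consent.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Action(Enum):
    GRANT = "grant"
    WITHDRAW = "withdraw"


@dataclass(frozen=True)
class ConsentEvent:
    user_id: str
    purpose: str
    action: Action
    policy_version: str
    recorded_at: datetime


class ConsentLedger:
    """Append-only log of consent grants and withdrawals."""

    def __init__(self):
        self._events = []

    def record(self, user_id, purpose, action, policy_version, at=None):
        event = ConsentEvent(
            user_id=user_id,
            purpose=purpose,
            action=action,
            policy_version=policy_version,
            recorded_at=at or datetime.now(timezone.utc),
        )
        self._events.append(event)  # never update or delete past entries
        return event

    def can_process(self, user_id, purpose):
        """True only if the latest event for this purpose is a grant."""
        latest = None
        for event in self._events:  # append order is chronological
            if event.user_id == user_id and event.purpose == purpose:
                latest = event
        return latest is not None and latest.action is Action.GRANT


ledger = ConsentLedger()
ledger.record("u-42", "marketing_email", Action.GRANT, "policy-v3")
ledger.record("u-42", "analytics", Action.GRANT, "policy-v3")
ledger.record("u-42", "analytics", Action.WITHDRAW, "policy-v3")

print(ledger.can_process("u-42", "marketing_email"))
print(ledger.can_process("u-42", "analytics"))
```

Keeping the ledger append-only preserves the audit trail: the withdrawal does not erase the fact that consent was once granted under a specific policy version. Whether a grant under an older policy version still counts is a policy decision this sketch leaves open.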
Data Lineage and Retention
You cannot comply with erasure requests if you do not know where data lives. Implement data lineage tracking that maps every data flow: which services ingest PII, where it gets replicated, and how long each copy is retained.
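One lightweight way to represent lineage is a directed graph of data flows that can be walked to list every system an erasure request must reach. The sketch below assumes a hand-maintained map; the flow edges are hypothetical examples, and real lineage tooling would derive them from pipeline metadata rather than a hard-coded dictionary.

```python
from collections import deque

# Hypothetical flows: each key copies PII into the systems listed for it.
PII_FLOWS = {
    "identity_service": ["postgres", "segment"],
    "postgres": ["elasticsearch", "redshift", "nightly_backup"],
    "segment": ["redshift", "mailchimp"],
}


def systems_holding_pii(source, flows):
    """Breadth-first walk returning every system reachable from source."""
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for downstream in flows.get(node, []):
            if downstream not in seen:
                seen.add(downstream)
                queue.append(downstream)
    return seen


targets = systems_holding_pii("identity_service", PII_FLOWS)
print(sorted(targets))
```

The resulting set doubles as a checklist for an erasure request: each system in it needs either a purge handler or a documented reason why it holds no personal data.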
Retention policies should be enforced by automation, not by documentation. Use TTLs in your data warehouse, lifecycle policies in S3, and scheduled purge jobs that run weekly. Every data store should have an owner and a documented retention period. If you cannot explain why you are keeping data and for how long, you probably should not be keeping it.
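As a rough illustration of a scheduled purge job, the sketch below uses an in-memory SQLite database as a stand-in for a real data store. The table names, retention periods, and the created_at column (stored as epoch seconds) are assumptions made for the example.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Hypothetical retention schedule: table name -> how long rows may be kept.
RETENTION = {
    "page_views": timedelta(days=90),
    "support_tickets": timedelta(days=730),
}


def purge_expired(conn, now):
    """Delete rows older than each table's retention period."""
    deleted = {}
    for table, keep_for in RETENTION.items():
        cutoff = int((now - keep_for).timestamp())
        # Table names come from the trusted constant above, never user input.
        cur = conn.execute(
            f"DELETE FROM {table} WHERE created_at < ?", (cutoff,)
        )
        deleted[table] = cur.rowcount
    conn.commit()
    return deleted


conn = sqlite3.connect(":memory:")
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
old = int((now - timedelta(days=100)).timestamp())
recent = int((now - timedelta(days=10)).timestamp())

for table in RETENTION:
    conn.execute(
        f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, created_at INTEGER)"
    )
    conn.executemany(
        f"INSERT INTO {table} (created_at) VALUES (?)", [(old,), (recent,)]
    )

result = purge_expired(conn, now)
print(result)
```

Returning the per-table delete counts gives the job something to log, which makes it possible to show later that the retention policy actually ran.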
Cross-Border Data Transfers
GDPR restricts transferring EU personal data outside the European Economic Area unless the destination has an adequacy decision or appropriate safeguards are in place. After the Schrems II ruling invalidated Privacy Shield in 2020, Standard Contractual Clauses (SCCs) supplemented by transfer impact assessments became the primary mechanism. Since July 2023, transfers to US organizations certified under the EU-US Data Privacy Framework can also rely on an adequacy decision. On the architecture side, consider regional data residency: deploy EU-specific infrastructure, configure database replicas within EU regions, and use geo-routing to ensure EU user data stays in EU data centers. This is complex but increasingly necessary as enforcement intensifies.
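At the application layer, residency routing can be reduced to a lookup plus a guard that fails closed. The sketch below is only an illustration: the residency codes, the User type, and the region mapping are assumptions, with AWS-style region names used as examples.

```python
from dataclasses import dataclass

# Hypothetical mapping from a user's residency to a home storage region.
REGION_BY_RESIDENCY = {"EU": "eu-central-1", "US": "us-east-1"}
# Regions treated as inside the EEA for this example.
EEA_REGIONS = {"eu-central-1", "eu-west-1"}


@dataclass(frozen=True)
class User:
    user_id: str
    residency: str


def storage_region(user):
    """Pick the home region; unknown residency is an error, not a default."""
    region = REGION_BY_RESIDENCY.get(user.residency)
    if region is None:
        raise ValueError(f"no storage region configured for {user.residency!r}")
    return region


def check_write_allowed(user, target_region):
    """Refuse to write EU users' data to a region outside the EEA."""
    if user.residency == "EU" and target_region not in EEA_REGIONS:
        raise PermissionError(
            f"EU user {user.user_id} cannot be stored in {target_region}"
        )


eu_user = User("u-42", "EU")
home_region = storage_region(eu_user)
check_write_allowed(eu_user, home_region)  # passes silently
```

Raising on an unknown residency, rather than falling back to a default region, keeps a misconfiguration from silently moving data across a border.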
Key Points
- Right to erasure requires you to actually find and delete all copies of a user's data, including backups, logs, analytics, and third-party systems
- Consent must be granular, freely given, and withdrawable. A single 'I agree' checkbox is not GDPR-compliant
- Data lineage tracking is essential: you cannot delete what you cannot find
- Retention policies must be enforced automatically, not just documented
- CCPA gives California residents the right to know, delete, and opt out of data sales, with a 45-day response window
Common Mistakes
- Storing PII in unindexed log files where it cannot be found or deleted for erasure requests
- Treating consent as a one-time event rather than a mutable state that users can change at any time
- Building deletion logic that removes data from the primary database but not from caches, search indices, or the data warehouse
- Not distinguishing between data controller and data processor responsibilities when using third-party services