Privacy by Design
The Seven Foundational Principles
Privacy by Design, formalized by Ann Cavoukian and now embedded in GDPR Article 25, rests on seven principles that translate directly to engineering decisions:
- Proactive, not reactive - Anticipate privacy risks during design, not after incidents. Include privacy as a non-functional requirement in every technical spec.
- Privacy as the default - Systems should protect personal data automatically. Users should not have to take action to protect their privacy. Default sharing settings should be off, not on.
- Privacy embedded into design - Privacy is not an add-on feature. It is a core architecture constraint, like security or scalability.
- Full functionality (positive-sum) - Privacy and functionality are not trade-offs. Design systems where you can have both, using techniques like differential privacy, on-device processing, and federated learning.
- End-to-end security - Data protection across the entire lifecycle: collection, processing, storage, sharing, and deletion.
- Visibility and transparency - Users and auditors can verify that privacy controls are working as intended.
- Respect for user privacy - User-centric design with granular controls, easy data export, and straightforward deletion.
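The "privacy as the default" principle translates directly into how settings objects are initialized. A minimal sketch (all class and field names here are hypothetical, chosen for illustration):

```python
from dataclasses import dataclass

# Hypothetical settings object showing "privacy as the default":
# every sharing-related flag starts disabled, so a user who never
# opens the settings screen shares nothing.
@dataclass
class UserPrivacySettings:
    share_profile_publicly: bool = False   # opt-in, never opt-out
    share_usage_analytics: bool = False
    allow_marketing_email: bool = False

settings = UserPrivacySettings()
assert settings.share_profile_publicly is False
```

The point is structural: a user who takes no action ends up in the most protective state, and enabling sharing is always an explicit act.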
Data Minimization in Practice
Data minimization is not about collecting less data for the sake of it. It is about reducing attack surface and compliance scope. Every field of personal data you collect is a field you must protect, a field that can be breached, and a field that must be findable for deletion requests.
Practical techniques:
- Collect at the point of need - Do not ask for a phone number at registration if you only need it for two-factor authentication. Collect it when the user enables 2FA.
- Aggregate early - If you need analytics on user behavior, aggregate events into anonymous counts at ingestion time rather than storing individual user actions.
- Ephemeral processing - Process PII in memory and discard it. If you need to verify an ID document, extract the verified status and discard the document image.
- Field-level encryption - For PII that must be stored, encrypt individual fields so that database access alone does not expose personal data.
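The "aggregate early" technique can be sketched in a few lines. The event shape below is illustrative; the key move is that identifying fields are dropped at ingestion and only anonymous counts are persisted:

```python
from collections import Counter

# Minimal sketch of "aggregate early": instead of storing one row per
# user action (user_id, page, timestamp), increment an anonymous
# counter at ingestion time and discard the identifying fields.
raw_events = [
    {"user_id": "u1", "page": "/home"},
    {"user_id": "u2", "page": "/home"},
    {"user_id": "u1", "page": "/pricing"},
]

page_views = Counter(event["page"] for event in raw_events)

assert page_views["/home"] == 2      # only aggregate counts survive
assert page_views["/pricing"] == 1   # user_id never leaves ingestion
```

If individual events must be retained briefly (e.g. for deduplication), keep that buffer short-lived and treat it as PII until it is aggregated.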
Consent Architecture
Consent is the legal basis for most personal data processing. A well-designed consent system has three components:
- Consent collection - Granular, per-purpose consent requests that explain what data is collected, why, and how long it is retained. Presented at the point of data collection, not buried in terms of service.
- Consent ledger - An immutable record of every consent grant, withdrawal, and modification. Each entry includes the user ID, purpose, timestamp, policy version, and collection method (web form, API, verbal).
- Consent enforcement - An internal API, e.g. `canProcess(userId, purpose) -> boolean`, that every service calls before processing personal data. This centralizes consent checks and prevents data processing after consent withdrawal.
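The ledger and enforcement components fit together as follows. This is a sketch of the `canProcess(userId, purpose)` check described above: the dataclass fields mirror the ledger entry, an in-memory list stands in for an append-only store, and all names are illustrative:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Sketch of a consent ledger entry: one immutable record per grant,
# withdrawal, or modification, as described above.
@dataclass(frozen=True)
class ConsentEvent:
    user_id: str
    purpose: str
    granted: bool        # True = grant, False = withdrawal
    timestamp: datetime
    policy_version: str
    method: str          # "web_form", "api", "verbal"

def can_process(ledger: list[ConsentEvent], user_id: str, purpose: str) -> bool:
    """The most recent entry for (user, purpose) wins; no entry means no consent."""
    relevant = [e for e in ledger if e.user_id == user_id and e.purpose == purpose]
    return max(relevant, key=lambda e: e.timestamp).granted if relevant else False

t0 = datetime.now(timezone.utc)
ledger = [ConsentEvent("u1", "marketing", True, t0, "v3", "web_form")]
assert can_process(ledger, "u1", "marketing")        # consent granted
assert not can_process(ledger, "u1", "analytics")    # never asked: default deny
ledger.append(ConsentEvent("u1", "marketing", False, t0 + timedelta(days=1), "v3", "api"))
assert not can_process(ledger, "u1", "marketing")    # withdrawal wins
```

Note the default-deny behavior: absence of a ledger entry means no consent, which keeps the enforcement API safe for purposes added after a user signed up.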
Anonymization vs. Pseudonymization
This distinction has real architectural consequences:
| Technique | Reversible? | GDPR Applies? | Example |
|---|---|---|---|
| Anonymization | No | No, data is no longer personal | k-anonymity, differential privacy, aggregation |
| Pseudonymization | Yes (with key/table) | Yes, data is still personal | Hashing, tokenization, encryption |
If you hash a user's email with SHA-256, that is pseudonymization. Anyone with the same email can produce the same hash. True anonymization requires that no reasonable effort can re-identify the individual. For analytics, consider techniques like differential privacy (adding calibrated noise to query results) or k-anonymity (ensuring every record is indistinguishable from at least k-1 others).
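Both points above can be demonstrated directly. The first few lines show why a hash is only a pseudonym (it is deterministic); the `noisy_count` function is a minimal differential-privacy sketch that adds Laplace noise to a count, with illustrative parameter values rather than a recommendation:

```python
import hashlib
import math
import random

# Hashing is deterministic, so a SHA-256 of an email is a pseudonym:
# anyone holding the same email (or a precomputed lookup table) can
# reproduce the token and re-link the record.
token = hashlib.sha256(b"alice@example.com").hexdigest()
assert token == hashlib.sha256(b"alice@example.com").hexdigest()

# Differential-privacy sketch: add Laplace noise calibrated to
# sensitivity / epsilon before releasing a count.
def noisy_count(true_count: int, epsilon: float = 1.0, sensitivity: float = 1.0) -> float:
    scale = sensitivity / epsilon
    u = random.random() - 0.5                  # uniform on [-0.5, 0.5)
    # Inverse-CDF sampling from Laplace(0, scale)
    noise = -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise
```

Smaller `epsilon` means more noise and stronger privacy; production systems also have to budget `epsilon` across repeated queries, which this sketch does not address.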
Privacy Impact Assessments
Every feature that introduces new personal data collection or changes how existing data is processed should go through a Privacy Impact Assessment (PIA). This does not have to be a heavyweight legal process. It can be a section in your design document that answers five questions:
- What personal data does this feature collect or process?
- What is the legal basis for processing (consent, legitimate interest, contract)?
- Who has access to this data, and why?
- How long is the data retained, and how is it deleted?
- What happens if this data is breached?
Treat these answers as design constraints and enforce them in code reviews.
Key Points
- Privacy by Design means embedding privacy into system architecture from the start, not bolting it on after launch
- Data minimization is the most powerful privacy control. Do not collect data you do not need
- Purpose limitation requires that data collected for one purpose is not repurposed without explicit consent
- Anonymization is irreversible and removes data from GDPR scope entirely; pseudonymization is reversible and data remains in scope
- Privacy impact assessments (PIAs) should be part of the design review process for any feature that handles personal data
Common Mistakes
- ✗ Collecting all available user data 'just in case' it becomes useful later. This violates data minimization and creates unnecessary risk
- ✗ Confusing pseudonymization with anonymization. Hashing an email is pseudonymization (reversible with a lookup table), not anonymization
- ✗ Treating privacy as a legal team responsibility instead of an engineering design constraint
- ✗ Building analytics pipelines on raw PII when aggregated or anonymized data would serve the same purpose