Audit Logging Architecture
Why Audit Logs Are Different
Application logs help engineers debug issues. Audit logs help auditors, security teams, and regulators understand who did what. These serve fundamentally different purposes and should be treated as separate systems with distinct schemas, storage, retention policies, and access controls.
Every audit log entry should capture a consistent set of fields:
- Actor - Who performed the action (user ID, service account, API key identifier)
- Action - What was done (created, read, updated, deleted, exported, approved)
- Target - Which resource was affected (resource type + ID)
- Timestamp - When it happened (UTC, millisecond precision)
- Source - Where the request originated (IP address, user agent, geographic location)
- Outcome - Whether the action succeeded or failed (and why)
- Context - Request ID, session ID, and any relevant metadata for correlation
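The field list above can be captured as a single typed record. The following is a minimal sketch, not a prescribed schema: the class name `AuditEvent`, the field names, and the identifier formats such as `user:1842` are all assumptions made for illustration.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _utc_now_ms() -> str:
    # ISO 8601 timestamp in UTC with millisecond precision.
    return datetime.now(timezone.utc).isoformat(timespec="milliseconds")


@dataclass(frozen=True)  # frozen: an event cannot be mutated after creation
class AuditEvent:
    actor: str                            # who: user ID, service account, API key ID
    action: str                           # what: created, read, updated, deleted, ...
    target_type: str                      # which resource type was affected
    target_id: str                        # which resource instance was affected
    source_ip: str                        # where the request originated
    outcome: str                          # "success" or "failure"
    request_id: str                       # correlation ID for the request
    failure_reason: Optional[str] = None  # why, when outcome is "failure"
    user_agent: Optional[str] = None
    session_id: Optional[str] = None
    event_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    timestamp: str = field(default_factory=_utc_now_ms)

    def to_json(self) -> str:
        # Sorted keys give a stable serialization, which helps hashing and diffing.
        return json.dumps(asdict(self), sort_keys=True)


event = AuditEvent(
    actor="user:1842",
    action="exported",
    target_type="report",
    target_id="rpt-77",
    source_ip="203.0.113.10",
    outcome="success",
    request_id="req-abc123",
)
print(event.to_json())
```

Making the required fields positional and non-optional means an event with no actor or outcome fails at construction time rather than showing up as a gap during an audit.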
Immutability and Tamper-Evidence
Regulators require proof that audit logs have not been altered. There are several architectural approaches to immutability:
- Write-once storage - S3 with Object Lock in Compliance mode, or immutable storage for Azure Blob Storage. While the retention period is in effect, entries cannot be overwritten or deleted, even by the root account.
- Hash chains - Each log entry includes a cryptographic hash of the previous entry, so the entries form a chain similar to a blockchain. Modifying any entry invalidates every hash that follows it, which makes the change detectable.
- Separate write and read accounts - The service that writes audit logs has no delete permissions. The team that reads audit logs has no write permissions. This separation of duties prevents any single actor from both creating and modifying entries.
- Third-party attestation - Services like AWS CloudTrail or Chronicle provide independent, vendor-managed audit trails that sit outside your application's control, so your team cannot easily tamper with them.
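The hash chain approach from the list above can be illustrated in a few lines. This is a sketch under stated assumptions rather than a production design: the function names, the all-zero genesis value, and the choice of SHA-256 over a canonical JSON serialization are illustrative choices, not requirements from any standard.

```python
import hashlib
import json
from typing import Optional

GENESIS_HASH = "0" * 64  # assumed placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so the same payload
    # always produces the same bytes and therefore the same hash.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


def append_entry(chain: list, payload: dict) -> dict:
    # Link the new entry to the hash of the entry before it.
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    entry = {
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": entry_hash(prev_hash, payload),
    }
    chain.append(entry)
    return entry


def first_broken_index(chain: list) -> Optional[int]:
    # Recompute every hash from the start; return the index of the first
    # entry that does not verify, or None if the whole chain is intact.
    prev_hash = GENESIS_HASH
    for index, entry in enumerate(chain):
        expected = entry_hash(prev_hash, entry["payload"])
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return index
        prev_hash = entry["hash"]
    return None


chain: list = []
append_entry(chain, {"actor": "user:1", "action": "created", "target": "doc:9"})
append_entry(chain, {"actor": "user:2", "action": "deleted", "target": "doc:9"})
print(first_broken_index(chain))  # None while the chain is untouched
```

A chain like this only detects edits by someone who cannot also recompute every later hash. To cover that case, the most recent hash is typically published or stored somewhere the log writer cannot modify, such as write-once storage.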
Querying Patterns
Auditors ask specific questions: "Show me all admin privilege escalations in Q3" or "Who accessed customer X's data in the last 90 days?" Your audit log system must support efficient queries across these dimensions.
Recommended architecture: Write audit events to an append-only event stream (Kafka, Kinesis), then fan out to two destinations:
- Hot storage (0-90 days) - Elasticsearch or ClickHouse for fast, ad-hoc queries during incident investigations and security reviews.
- Cold storage (90 days to retention limit) - S3 with Object Lock, partitioned by date and queryable via Athena or Presto for audit-period queries.
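The date partitioning and the 90-day split can be sketched as two small helpers. The names `cold_storage_key` and `query_tier`, the `audit` prefix, and the one-object-per-event layout are assumptions for illustration; the `year=/month=/day=` layout follows the Hive-style partition convention that engines such as Athena and Presto can use to skip partitions outside the requested date range.

```python
from datetime import datetime, timedelta, timezone

HOT_RETENTION = timedelta(days=90)  # boundary between hot and cold, per the text


def cold_storage_key(event_time: datetime, event_id: str, prefix: str = "audit") -> str:
    # Normalize to UTC so an event always lands in the same daily partition
    # regardless of the time zone it was recorded in.
    t = event_time.astimezone(timezone.utc)
    return f"{prefix}/year={t:%Y}/month={t:%m}/day={t:%d}/{event_id}.json"


def query_tier(event_time: datetime, now: datetime) -> str:
    # Decide which store to query for an event of a given age.
    return "hot" if now - event_time <= HOT_RETENTION else "cold"


event_time = datetime(2024, 3, 5, 23, 30, tzinfo=timezone.utc)
print(cold_storage_key(event_time, "evt-1"))
# audit/year=2024/month=03/day=05/evt-1.json
```

A real pipeline would usually batch many events into each object rather than writing one object per event, but the partition prefix would follow the same pattern.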
What to Log and What Not to Log
Always log: Authentication events (login, logout, MFA challenges), authorization decisions (access granted, access denied), data access (reads of sensitive resources), data mutations (creates, updates, deletes), configuration changes (IAM policy updates, feature flag changes), and administrative actions (user provisioning, role assignments).
Never log: Passwords, authentication tokens, session cookies, encryption keys, full credit card numbers, or unmasked PII. If you must reference PII in audit logs for context, use tokenized or hashed identifiers. The audit log itself must not become a data protection liability.
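One way to enforce these rules is to sanitize event metadata before it reaches the audit stream. The sketch below assumes hypothetical field names (`password`, `email`, and so on) and a secret key supplied by the caller; it drops forbidden fields outright and replaces PII with a keyed HMAC-SHA256 token, so that the same input always maps to the same token without the raw value being stored.

```python
import hashlib
import hmac

# Hypothetical field names; adapt both sets to your own schema.
FORBIDDEN_FIELDS = {"password", "token", "session_cookie", "encryption_key", "card_number"}
PII_FIELDS = {"email", "phone"}


def tokenize(value: str, key: bytes) -> str:
    # A keyed hash (HMAC) rather than a bare hash: without the key, an attacker
    # cannot confirm a guess by hashing candidate emails or phone numbers.
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def sanitize_metadata(metadata: dict, key: bytes) -> dict:
    clean = {}
    for name, value in metadata.items():
        if name in FORBIDDEN_FIELDS:
            continue  # secrets are never written, not even in hashed form
        if name in PII_FIELDS:
            clean[name + "_token"] = tokenize(str(value), key)
        else:
            clean[name] = value
    return clean


raw = {"email": "alice@example.com", "password": "hunter2", "request_id": "req-1"}
print(sorted(sanitize_metadata(raw, key=b"demo-key-only")))
# ['email_token', 'request_id']
```

Because tokens are deterministic for a given key, auditors can still correlate all events for one person. The key itself should live in a secrets manager, not alongside the logs.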
Alerting on Audit Events
Audit logs are not just for retrospective review. High-risk events should trigger real-time alerts:
- Multiple failed login attempts from the same actor
- Privilege escalation events (user granted admin role)
- Bulk data export or access events
- Configuration changes to security-critical systems
- Access to sensitive resources outside business hours
Pipe these events through a SIEM (Splunk, Sentinel, Elastic Security) or build lightweight alerting rules on your event stream. The goal is to detect anomalous behavior in minutes, not during the next quarterly access review.
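As an illustration of a lightweight rule on the event stream, the first alert in the list above (repeated failed logins from one actor) can be written as a sliding-window counter. The class name, the default threshold of 5 failures in 300 seconds, and the assumption that events arrive in timestamp order are all illustrative choices, not recommendations from a specific SIEM.

```python
from collections import defaultdict, deque


class FailedLoginDetector:
    """Flags an actor once they reach `threshold` failed logins within `window_seconds`."""

    def __init__(self, threshold: int = 5, window_seconds: float = 300.0):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # actor -> timestamps of recent failures

    def observe(self, actor: str, action: str, outcome: str, ts: float) -> bool:
        # Only failed logins count toward this rule.
        if action != "login" or outcome != "failure":
            return False
        window = self._failures[actor]
        window.append(ts)
        # Drop failures that have aged out of the window (assumes ordered timestamps).
        while window and ts - window[0] > self.window_seconds:
            window.popleft()
        return len(window) >= self.threshold


detector = FailedLoginDetector(threshold=3, window_seconds=60)
for ts in (0.0, 10.0, 20.0):
    alert = detector.observe("user:42", "login", "failure", ts)
print(alert)  # True: three failures within 60 seconds
```

The other alerts in the list follow the same shape: a predicate over event fields plus, where needed, a small amount of per-actor state.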
Key Points
- Audit logs answer who did what, to which resource, when, and from where. Every entry must include actor, action, target, timestamp, and source IP
- Immutability is non-negotiable: audit logs must be append-only and tamper-evident to satisfy regulatory requirements
- Separate audit logs from operational logs. They serve different audiences (auditors vs. engineers) and have different retention requirements
- Structured logging with consistent schemas enables efficient querying across millions of events
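- Retention periods vary by framework: PCI DSS requires at least 1 year of audit log history, with the most recent 3 months immediately available; HIPAA requires 6 years for required documentation, which is commonly applied to audit logs; SOC 2 prescribes no fixed period, though 1 year is common practice because it covers a typical audit window; GDPR has no fixed minimum but expects proportionality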
Common Mistakes
- Mixing audit logs with application debug logs in the same stream, making audit queries slow and unreliable
- Logging sensitive data (passwords, tokens, PII) in audit entries, which turns audit logs themselves into a compliance liability
- Using mutable storage for audit logs where entries can be modified or deleted by administrators
- Not testing audit log completeness. Running an audit and discovering gaps in coverage is a control failure
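A completeness test can be as simple as comparing the actions you expect to be audited against the actions that actually appear in the log after exercising the system. The sketch below assumes a hand-maintained list of expected (action, target type) pairs and events represented as dictionaries with `action` and `target_type` keys; both are hypothetical conventions for illustration.

```python
def coverage_gaps(expected_actions, audit_events):
    """Return the expected (action, target_type) pairs that never appear in the log."""
    observed = {(event["action"], event["target_type"]) for event in audit_events}
    return sorted(set(expected_actions) - observed)


expected = [("deleted", "user"), ("exported", "report"), ("updated", "role")]
events = [
    {"action": "deleted", "target_type": "user"},
    {"action": "updated", "target_type": "role"},
]
print(coverage_gaps(expected, events))  # [('exported', 'report')]
```

Running a check like this in an integration test suite surfaces a missing audit hook before an auditor does.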