SELinux & AppArmor
Mental Model
A hospital where every staff member wears a colored badge and every room door lists which badge colors may enter. The janitor carries a master keycard -- root -- that opens any standard lock, but the badge system sits on a separate circuit entirely. Keycard accepted, wrong badge color? Door stays shut. SELinux works like permanent badge stamps on every person and every door. AppArmor works like a clipboard at each door listing permitted names by room number. Neither system cares about the keycard.
The Problem
An attacker exploits a web server bug and lands root. From there, DAC rolls out the red carpet: /etc/shadow is readable, every database file is open, lateral movement to other services takes under 60 seconds. On a shared-kernel host, Container A running as UID 0 reads Container B's database volume because DAC cannot tell one root from another -- 10,000+ user credentials leak. Meanwhile a 200-node Kubernetes cluster split between RHEL and Ubuntu has no unified MAC story, leaving compromised pods on unprotected nodes free to touch any host file they please.
Architecture
Root is supposed to be all-powerful. That is the whole point of root.
But on a properly configured RHEL box, a process running as root inside the httpd_t domain cannot read /etc/shadow. The kernel checks the file permissions -- root passes, of course. Then it checks the SELinux policy. No allow rule exists for httpd_t accessing shadow_t. Access denied. Root gets EACCES.
This is Mandatory Access Control. The kernel enforces rules that even root cannot override. And it is the reason a compromised web server on a MAC-enabled system is an incident, not a catastrophe.
What Actually Happens
When a process tries to open a file, the kernel runs two security checks in sequence.
First: DAC (Discretionary Access Control). Traditional file permissions. Owner, group, other. ACLs. This is the rwx check everyone knows. If DAC denies access, the operation fails immediately. MAC never even runs.
Second: MAC via LSM hooks. If DAC passes, the kernel hits a Linux Security Module hook -- security_inode_permission(). This is where SELinux or AppArmor makes its decision.
For SELinux, the hook does three things: (1) look up the process's security context (e.g., httpd_t), (2) look up the file's security context (e.g., httpd_sys_content_t), (3) check the Access Vector Cache for a rule allowing this specific access. Cache hit takes about 100 nanoseconds. Cache miss triggers a full policy database lookup.
For AppArmor, the hook looks up the profile for the current binary and checks whether the file path matches an allowed pattern. /var/www/** with read permission? Allowed. /etc/shadow? Not in the profile. Denied.
Both produce audit log entries on denial. SELinux writes AVC (Access Vector Cache) denial messages. AppArmor writes APPARMOR_DENIED messages.
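To make the two log formats concrete, the sketch below pulls the key fields out of one denial line from each system. The sample lines are hand-written in the usual audit shape rather than captured from a real host, and the helper name `audit_field` is hypothetical.

```shell
# audit_field KEY LINE: print the value of KEY=value from an audit line,
# stripping surrounding double quotes. Values containing spaces are not handled.
audit_field() {
  printf '%s\n' "$2" | tr ' ' '\n' | sed -n "s/^$1=//p" | tr -d '"' | head -n 1
}

# Illustrative samples (assumed, not taken from a real log).
selinux_line='type=AVC msg=audit(1700000000.123:456): avc:  denied  { read } for  pid=1234 comm="httpd" name="shadow" scontext=system_u:system_r:httpd_t:s0 tcontext=system_u:object_r:shadow_t:s0 tclass=file permissive=0'
apparmor_line='audit: type=1400 audit(1700000000.456:789): apparmor="DENIED" operation="open" profile="/usr/sbin/nginx" name="/etc/shadow" pid=1234 comm="nginx" requested_mask="r" denied_mask="r"'

echo "SELinux:  $(audit_field scontext "$selinux_line") -> $(audit_field tcontext "$selinux_line") ($(audit_field tclass "$selinux_line"))"
echo "AppArmor: profile $(audit_field profile "$apparmor_line") denied $(audit_field name "$apparmor_line")"
```

The SELinux line names two labels (who asked, what was asked for), while the AppArmor line names a profile and a path, which is the labels-versus-paths split in miniature.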
Under the Hood
SELinux labels everything. Every process, file, socket, port, and IPC object carries a security context in the format user:role:type:level. The type field does the heavy lifting. Type Enforcement rules explicitly allow specific types to interact: allow httpd_t httpd_sys_content_t:file { read open getattr }. No allow rule means default deny. The entire policy is compiled into a binary blob loaded by the kernel at boot.
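As a small illustration of the user:role:type:level format, the following sketch splits a context string into its fields. The function name `ctx_field` is made up for this example; on a real host the input would come from something like `ls -Z` or `ps -eZ`.

```shell
# ctx_field CONTEXT FIELD: print one field of a user:role:type:level context.
# The level is everything after the third colon, so MCS levels such as
# s0:c1,c2 (which contain a colon themselves) stay intact.
ctx_field() (
  IFS=: read -r user role type level <<EOF
$1
EOF
  case "$2" in
    user)  printf '%s\n' "$user" ;;
    role)  printf '%s\n' "$role" ;;
    type)  printf '%s\n' "$type" ;;
    level) printf '%s\n' "$level" ;;
    *)     echo "unknown field: $2" >&2; exit 2 ;;
  esac
)

ctx_field 'system_u:system_r:httpd_t:s0' type
ctx_field 'system_u:system_r:container_t:s0:c1,c2' level
```

The type field is the one Type Enforcement rules match on; the level field is where the MCS categories used for container isolation live.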
The AVC caches decisions in a hash table keyed by source context, target context, and object class (each context is represented internally by a security identifier, or SID). This is critical for performance -- without the cache, every file access would require a full policy database traversal.
SELinux booleans are the admin-friendly knobs. setsebool -P httpd_can_network_connect on allows Apache to make outbound network connections. Each boolean toggles a set of policy rules without recompiling the policy. getsebool -a | grep httpd shows all httpd-related booleans.
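For scripting around booleans, a filter over the `name --> on|off` lines that getsebool prints can be handy. This is a minimal sketch; `bool_state` is a hypothetical helper and the sample values are made up.

```shell
# bool_state NAME: read getsebool-style lines ("name --> on|off") on stdin
# and print the state of NAME; exit status 1 if NAME is not listed.
bool_state() {
  awk -v name="$1" '
    $1 == name && $2 == "-->" { print $3; found = 1 }
    END { exit (found ? 0 : 1) }
  '
}

# Sample text in the format getsebool prints (values here are invented).
sample='httpd_can_network_connect --> off
httpd_enable_cgi --> on'

printf '%s\n' "$sample" | bool_state httpd_can_network_connect

# On a real SELinux host the same filter could be fed live data:
#   getsebool -a | bool_state httpd_can_network_connect
```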
MCS for container isolation is elegant. Container A gets s0:c1,c2. Container B gets s0:c3,c4. Files written by A are labeled s0:c1,c2. Process B has s0:c3,c4. The categories do not match, so B cannot read A's files -- even if DAC says otherwise.
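The category comparison can be sketched as a subset test: access is possible only when the process holds every category on the file. This is a simplified model (comma-separated lists only, no ranges such as c0.c1023), and `mcs_dominates` is a name invented for the example.

```shell
# mcs_dominates SUBJECT_CATS OBJECT_CATS
# Succeeds when every category on the object is also held by the subject.
mcs_dominates() (
  subject=",$1,"
  IFS=,
  for c in $2; do
    case "$subject" in
      *",$c,"*) ;;        # subject holds this category
      *) exit 1 ;;        # missing category: access denied
    esac
  done
  exit 0
)

check() {
  if mcs_dominates "$1" "$2"; then echo "process $1 -> file $2: allowed"
  else echo "process $1 -> file $2: denied"; fi
}

check c1,c2 c1,c2   # container A reading its own file
check c3,c4 c1,c2   # container B reading A's file
```

This check sits on top of Type Enforcement: both containers can share the same type, and the categories alone keep them apart.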
AppArmor takes the simpler path. Profiles are human-readable text files that list allowed paths with glob patterns, capabilities, and network access rules. A profile for nginx might allow read access to /var/www/**, write to /var/log/nginx/**, and the net_bind_service capability. No labels, no contexts, no type enforcement rules. The tradeoff: hard links bypass path-based rules, and path canonicalization has edge cases.
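A toy version of path-based matching, using shell `case` patterns as a stand-in for profile rules, might look like this. The rules mirror the nginx example above; the function `profile_allows` is hypothetical, and real AppArmor matching is done by the kernel against a compiled profile, not by shell globbing.

```shell
# profile_allows PATH MODE: toy stand-in for an nginx-like profile.
# MODE is r or w. In a shell case pattern `*` also matches `/`, so
# `/var/www/*` here plays the role of AppArmor's recursive `/var/www/**`.
profile_allows() {
  case "$2:$1" in
    r:/var/www/*)       return 0 ;;  # read web content
    w:/var/log/nginx/*) return 0 ;;  # write logs
    *)                  return 1 ;;  # anything unlisted is denied
  esac
}

for request in r:/var/www/site/index.html w:/var/log/nginx/access.log \
               r:/etc/shadow w:/var/www/site/index.html; do
  mode=${request%%:*}
  path=${request#*:}
  if profile_allows "$path" "$mode"; then echo "ALLOWED $mode $path"
  else echo "DENIED  $mode $path"; fi
done
```

Note that this sketch compares the string it is handed and does no canonicalization, which is a deliberately naive way to see why path resolution is where the edge cases of a path-based model live.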
AppArmor profiles have three modes: enforce (deny and log), complain (allow but log), and kill (SIGKILL on violation). The development workflow: start in complain, run the application through all its code paths, use aa-logprof to generate rules from logs, review, then switch to enforce.
Common Questions
How does SELinux handle a root process trying to read /etc/shadow?
Root has CAP_DAC_OVERRIDE, so the DAC check passes. Then the LSM hook fires. SELinux looks for allow httpd_t shadow_t:file read in the policy. No such rule exists. Access denied with an AVC denial message in the audit log. The process sees EACCES despite running as root. Constraining root is the fundamental value of MAC.
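The order of the two checks can be modeled in a few lines. This is only a sketch of the decision flow: the policy table is a toy list of assumed allow tuples, and `te_allows` and `access` are names invented here.

```shell
# Toy policy: one "source target class permission" tuple per line (assumed rules).
POLICY='httpd_t httpd_sys_content_t file read
httpd_t httpd_sys_content_t file open'

# te_allows SOURCE TARGET CLASS PERM: succeed only if an exact allow tuple exists.
te_allows() {
  printf '%s\n' "$POLICY" | grep -Fqx "$1 $2 $3 $4"
}

# access DAC_RESULT SOURCE_TYPE TARGET_TYPE: DAC first, then MAC (default deny).
access() {
  if [ "$1" != allow ]; then
    echo "EACCES from DAC (MAC never consulted)"
    return 1
  fi
  if ! te_allows "$2" "$3" file read; then
    echo "EACCES from MAC: no allow rule for $2 -> $3"
    return 1
  fi
  echo "OK"
}

access allow httpd_t shadow_t            || true  # root in httpd_t: DAC passes, MAC denies
access allow httpd_t httpd_sys_content_t || true  # both layers agree
access deny  httpd_t httpd_sys_content_t || true  # DAC fails first
```

Being root only changes the first argument; it has no influence on the policy lookup, which is the whole point of the second layer.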
Why do organizations disable SELinux?
Complexity. When a new application fails with AVC denials, the quick fix is setenforce 0. The right fix is: use audit2why to understand the denial, audit2allow -M mymodule to generate a targeted policy module, test in permissive mode, then switch to enforcing. Organizations that invest in SELinux expertise get one of the strongest host security mechanisms available.
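The workflow above could be scripted roughly as follows. The script is a dry run by default and only prints the commands; the module name and the `run` helper are placeholders, and the generated .te file should be reviewed by hand before anything is installed.

```shell
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 only on a test host that has the audit and policy tools installed.
DRY_RUN=${DRY_RUN:-1}
MODULE=${MODULE:-mymodule}   # placeholder module name

run() {
  if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else sh -c "$*"; fi
}

run "ausearch -m AVC -ts recent | audit2why"                # 1. understand the denial
run "ausearch -m AVC -ts recent | audit2allow -M $MODULE"   # 2. generate a module, then review the .te file
run "semodule -i $MODULE.pp"                                # 3. install it after review
run "setenforce 1"                                          # 4. return to enforcing
```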
Can SELinux and AppArmor run at the same time?
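Not on mainline kernels. The lsm= boot parameter (added in kernel 5.1) lets several LSMs stack, but SELinux and AppArmor are both flagged as exclusive "major" modules, so only one of them can be active at a time; the remaining slots go to smaller modules such as Yama, LoadPin, or Lockdown. Running both would in any case be hard to operate -- their labeling and path-based models conflict, making debugging extremely difficult. In practice, distributions choose one: RHEL/Fedora use SELinux, Ubuntu/Debian use AppArmor.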
How does SELinux handle temporary files?
Type transitions. When httpd_t creates a file in a directory labeled tmp_t (such as /tmp), a type_transition rule automatically labels the new file httpd_tmp_t. Without a transition rule, a new file inherits the parent directory's type, which may be too permissive. restorecon relabels files to their default context based on path patterns.
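The labeling decision for a new file can be sketched as a table lookup with a fallback, assuming the parent directory is a generic temporary directory labeled tmp_t. The table rows, the myapp_t domain, and the `new_file_type` helper are illustrative assumptions, not a dump of a real policy.

```shell
# Toy transition table: "creator_domain parent_dir_type new_file_type".
# Rows are illustrative assumptions, not taken from a real policy.
TRANSITIONS='httpd_t tmp_t httpd_tmp_t
httpd_t var_log_t httpd_log_t'

# new_file_type DOMAIN DIR_TYPE: print the label a newly created file receives.
new_file_type() {
  printf '%s\n' "$TRANSITIONS" | awk -v d="$1" -v p="$2" '
    $1 == d && $2 == p { print $3; hit = 1; exit }
    END { if (!hit) print p }   # no rule: inherit the directory type
  '
}

new_file_type httpd_t tmp_t    # a transition rule applies
new_file_type myapp_t tmp_t    # hypothetical domain with no rule: falls back to the directory type
```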
How Technologies Use This
Container A reads Container B's database files because both run as UID 0 with identical Unix permissions. A compromised container traverses to another's storage volume, and standard DAC checks pass because root has CAP_DAC_OVERRIDE. Nothing in the permission model prevents cross-container data access.
The fundamental gap is that Unix DAC permissions are identity-based -- root is root, and there is no distinction between root in Container A and root in Container B. Without Mandatory Access Control, the kernel cannot enforce isolation between two processes that share the same UID and pass the same permission checks.
On SELinux systems, Docker assigns each container a unique MCS category pair like s0:c100,c200. Files written by Container A carry that label, and Container B running with s0:c300,c400 gets EACCES regardless of Unix permissions. On Ubuntu, the docker-default AppArmor profile denies mount, blocks writes to sensitive /proc and /sys paths, and limits ptrace to processes under the same profile. On the SELinux side, the AVC resolves these checks in about 100ns, adding negligible overhead while eliminating an entire class of cross-container data leaks.
A compromised pod running as root reads /etc/shadow and accesses any file on the node. The cluster has 200 nodes, half running RHEL with SELinux and the other half Ubuntu with AppArmor, and there is no unified way to enforce mandatory access control across both.
The challenge is that SELinux and AppArmor use fundamentally different models -- labels vs paths -- and operators would need separate security policies per distro. Without a unified abstraction at the Kubernetes layer, enforcing MAC consistently across a heterogeneous cluster is impractical.
Kubernetes exposes seLinuxOptions in the pod security context for SELinux nodes and, for AppArmor nodes, the appArmorProfile field (per-container annotations on older clusters). Pod Security Standards, from the baseline tier up, stop pods from opting out of the default AppArmor profile or setting arbitrary SELinux types, so every pod stays confined by whichever MAC system the node runs. On SELinux nodes, each pod gets unique MCS categories preventing cross-pod file access. On AppArmor nodes, the runtime default profile blocks mount and restricts ptrace. Either way, a compromised pod cannot read the node's /etc/shadow even as root, which sharply limits post-exploitation damage.
Same Concept Across Tech
| Concept | Docker | JVM | Node.js | Go | K8s |
|---|---|---|---|---|---|
| MAC confinement | SELinux MCS labels per container; AppArmor docker-default profile | JVM runs under container's MAC domain; no JVM-specific policy needed | Node runs under container's MAC domain; native addons may trigger denials | Go binaries under container MAC domain; static linking reduces denial surface | seLinuxOptions and appArmorProfile in securityContext per pod |
| Container isolation | MCS categories (s0:c100,c200) prevent cross-container file access | N/A -- JVM does not interact with MAC directly | N/A -- Node does not interact with MAC directly | N/A -- Go does not interact with MAC directly | Pod Security Standards restricted tier requires confined MAC profile |
| Policy development | docker run with custom --security-opt label or apparmor profile | strace + audit2allow to build policy for JVM syscall patterns | complain mode + aa-logprof for Node's file access patterns | audit2allow for Go binary's minimal syscall set | Kubernetes security profiles operator for automated profile generation |
| Debugging denials | ausearch -m AVC for SELinux; dmesg for AppArmor | AVC denials show domain_t accessing target_t -- map to JVM file paths | APPARMOR_DENIED in dmesg shows blocked path -- map to Node require() paths | Same debugging tools; Go's direct syscalls produce cleaner denial logs | kubectl logs + node audit logs for MAC denial correlation |

| Stack Layer | Mechanism |
|---|---|
| Application | Operates transparently -- MAC decisions happen in kernel without app awareness |
| Container runtime | Assigns SELinux MCS categories or loads AppArmor profiles before exec |
| LSM framework | 200+ hook points in VFS, networking, IPC, capabilities invoke registered MAC modules |
| SELinux engine | Loads compiled binary policy at boot; AVC cache handles most decisions in ~100ns |
| AppArmor engine | Compiles path-based profiles at load time; matches file paths against glob patterns |

Design rationale: MAC exists because DAC is identity-based, and once an attacker becomes root, identity-based checks are meaningless. A kernel-enforced policy layer that operates independently of uid/gid means root inside httpd_t still cannot read shadow_t. SELinux went with labels because they survive renames and hard links -- completeness at the cost of complexity. AppArmor went with paths because administrators can read and write profiles in minutes -- simplicity at the cost of edge cases around hard links and path canonicalization.
If You See This, Think This
| Symptom | Likely Cause | First Check |
|---|---|---|
| Application works as root but fails with EACCES | SELinux type enforcement blocking access | ausearch -m AVC -ts today |
| File accessible by path but not after mv | mv preserves source SELinux label; destination context wrong | ls -Z on the file; run restorecon -Rv on the directory |
| Container A can read Container B's files | MCS categories not assigned or identical for both containers | ps -eZ to compare container MCS labels |
| AppArmor profile breaks after directory restructuring | Path-based rules no longer match new file locations | aa-logprof to update profile from denial logs |
| New service fails immediately after deployment on RHEL | No SELinux policy module for the service; default deny blocks everything | setenforce 0 temporarily to confirm; then audit2allow -M to build policy |
| Hard link bypasses AppArmor file restriction | AppArmor matches paths, not inodes; hard link creates new path | Verify with ls -li; consider SELinux for inode-level enforcement |
When to Use / Avoid
- Defense-in-depth beyond DAC and capabilities -- MAC is what stops root-level lateral movement
- Isolating containers from each other via SELinux MCS categories on shared-kernel hosts
- Compliance mandates requiring Mandatory Access Control (PCI-DSS, HIPAA, FedRAMP)
- Confining services to least-privilege file, network, and capability access
- Never disable SELinux entirely to fix an app failure -- switch to permissive and use audit2allow
- Avoid AppArmor during rapid prototyping where file layouts change constantly, since path rules break on reorganization
Try It Yourself
```shell
# Check which MAC system is active
cat /sys/kernel/security/lsm 2>/dev/null || echo 'LSM info not available'

# SELinux: show current mode and a file context
getenforce 2>/dev/null && ls -Z /etc/passwd 2>/dev/null || echo 'SELinux not available'

# SELinux: search for AVC denials
# (command -v guards the pipeline; otherwise head's exit status hides a missing tool)
command -v ausearch >/dev/null 2>&1 && ausearch -m AVC -ts today 2>/dev/null | head -20 || echo 'ausearch not available'

# SELinux: generate policy from denials
command -v audit2allow >/dev/null 2>&1 && ausearch -m AVC -ts today 2>/dev/null | audit2allow 2>/dev/null | head -10 || echo 'audit2allow not available'

# AppArmor: show loaded profiles and their status
command -v aa-status >/dev/null 2>&1 && aa-status 2>/dev/null | head -20 || echo 'AppArmor not available'

# AppArmor: show profile for a specific binary
head -20 /etc/apparmor.d/usr.sbin.nginx 2>/dev/null || echo 'No nginx AppArmor profile found'
```
Debug Checklist
1. cat /sys/kernel/security/lsm -- check which LSM is active on this system
2. getenforce -- check SELinux mode (Enforcing/Permissive/Disabled)
3. ausearch -m AVC -ts today | head -20 -- find recent SELinux denials
4. aa-status -- show loaded AppArmor profiles and their enforcement mode
5. ls -Z /path/to/file -- view SELinux security context on a file
6. ps -eZ | grep $PROCESS -- view SELinux domain of a running process
Key Takeaways
- ✓SELinux labels every object (inode-level); AppArmor matches on file paths. This means SELinux survives file renames and hard links (the label stays on the inode), while AppArmor rules break when paths change. The tradeoff: AppArmor profiles are dramatically simpler to write and understand.
- ✓SELinux's type enforcement is default-deny. A rule like 'allow httpd_t httpd_sys_content_t:file { read open getattr }' explicitly permits Apache to read web content files. Without that rule, the access is denied and logged as an AVC denial (unless a dontaudit rule suppresses the log entry). Every allowed action must be declared.
- ✓MCS (Multi-Category Security) is how container runtimes use SELinux for isolation. Each container gets a unique category pair like s0:c1,c2. Files written by that container are labeled with the same categories. Another container with s0:c3,c4 cannot read them, even with correct DAC permissions.
- ✓AppArmor profiles support file globs (/var/www/** for recursive), owner conditionals, and capability lists. A profile for nginx: allow read /var/www/**, allow write /var/log/nginx/**, deny /etc/shadow, network inet tcp. Compilation happens at profile load time, not on every access.
- ✓Setting SELinux to permissive mode (setenforce 0) logs violations without blocking them -- essential for debugging 'why does my app fail?' But permissive is NOT a security posture. It is a diagnostic tool. Production must run enforcing.
Common Pitfalls
- ✗Mistake: Disabling SELinux entirely because an application fails. Reality: This removes a critical security layer. Use 'audit2allow' to generate policy rules from AVC denials, review them, and apply. The denial messages tell you exactly what rule is missing.
- ✗Mistake: Assuming AppArmor path rules apply to hard links. Reality: If a hard link to /etc/shadow exists at /tmp/shadow_copy, the rule denying /etc/shadow does not apply to the new path. AppArmor does mediate link creation by confined processes, but a link made outside the profile's control still opens a second path to the same inode. SELinux handles this correctly because the label is on the inode, not the path.
- ✗Mistake: Moving files instead of copying them and wondering why SELinux breaks. Reality: 'mv' preserves the source label. 'cp' inherits the destination directory's default context. A config file moved from /tmp to /etc/httpd/ keeps its tmp_t label, and httpd cannot read it. Fix with 'restorecon -Rv /etc/httpd/'.
- ✗Mistake: Writing overly broad AppArmor profiles (allowing /** rw) to avoid breakage. Reality: This defeats the purpose of MAC entirely. Start in complain mode ('aa-complain /path/to/profile'), exercise the application, use 'aa-logprof' to generate tight rules from the logs, then switch to enforce.
Reference
In One Line
Turn on MAC -- SELinux on RHEL, AppArmor on Ubuntu -- so that root is no longer a skeleton key and post-compromise lateral movement hits a wall.