Namespaces provide isolation between containers and the host system. Processes in one container cannot see or interfere with other containers or with host processes.

| Namespace | Isolates | Example |
|---|---|---|
| PID | Process IDs | Each container has its own independent process tree |
| NET | Network interfaces | Containers have separate virtual Ethernet devices |
| MNT | Mount points / filesystems | Separate view of filesystem mounts |
| IPC | Inter-process communication | Shared memory and semaphores are isolated |
| UTS | Hostname & domain name | Containers set their own hostnames |
| USER | User and group IDs | Maps container users to host UIDs/GIDs |
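
The namespace types in the table can be observed directly on a Linux system. The sketch below is one possible way to do it, assuming a Linux `/proc` filesystem is mounted; the function name `show_namespaces` is made up for illustration. It prints the kernel's identifier for each namespace a process belongs to.

```shell
#!/bin/sh
# show_namespaces is a hypothetical helper, not part of Docker.
# It prints the namespace identifiers of a process (default: this shell).
# Two processes share a namespace when the identifiers for that type match.
show_namespaces() {
    pid="${1:-$$}"
    for ns in pid net mnt ipc uts user; do
        # Each entry under /proc/<pid>/ns is a link such as "pid:[4026531836]".
        target="$(readlink "/proc/$pid/ns/$ns" 2>/dev/null)" || target="(unavailable)"
        printf '%-5s %s\n' "$ns" "$target"
    done
}

show_namespaces
```

Running the same function on the host and inside a container (for example, via `docker run --rm ubuntu readlink /proc/self/ns/pid`) should show different identifiers for the namespaces the container does not share with the host.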

Control groups (cgroups) control how much CPU, memory, disk I/O, and network bandwidth a container can use. Namespaces isolate visibility and cgroups limit impact.

- Use `--cap-drop` to remove unnecessary kernel privileges (e.g., `CAP_SYS_ADMIN`).
- `--read-only` prevents write access to the root filesystem, limiting the impact of a compromise.
- Use the `USER` directive in the Dockerfile to run as a non-root user and limit privilege escalation inside containers.
- `docker scan`, Trivy, and Clair help detect outdated or vulnerable packages.
- `--network host` makes the container share the host network stack, reducing isolation.
- Use `--publish` judiciously to expose only necessary ports.
- Use `bridge` or `overlay` networks to isolate traffic between containers, enabling service segmentation.
- Apply kernel security mechanisms such as AppArmor and seccomp.
- `--privileged` lifts all security restrictions, giving the container full host access (devices, kernel modules, etc.).
- `-v /:/host` gives the container access to the host's entire filesystem, which can be exploited.
- Mounting `/var/run/docker.sock` allows the container to control the Docker daemon, which is effectively full root access.
- Use `--security-opt apparmor=docker-default` to apply a restrictive policy.
- seccomp filters restrict sensitive syscalls (e.g., `ptrace`, `keyctl`). Custom profiles allow more granular control.
- Drop capabilities the workload does not need (e.g., `CAP_NET_RAW` to block raw socket use).
- Use `--cap-add` and `--cap-drop` to follow the principle of least privilege.

| Feature | AppArmor / SELinux | seccomp |
|---|---|---|
| Type | Mandatory Access Control (MAC) | System Call Filtering |
| Scope | Controls access to files, network, processes, etc. | Controls access to specific system calls (syscalls) |
| Goal | Limit what a process/container can access or interact with | Limit what a process can ask the kernel to do |
| Granularity | File paths, network, IPC, user/group ID, etc. | Individual syscalls like clone, ptrace, mount, etc. |
| How it works | Applies a profile that labels and restricts access | Uses a syscall filter list to allow or deny operations |
| Docker Usage | --security-opt apparmor=profile_name or SELinux label=... | Enabled by default in Docker with a default profile |
| Example | Deny write to /etc/shadow or opening raw sockets | Deny syscalls like keyctl, mount, or ptrace |
| OS Dependency | AppArmor (Ubuntu), SELinux (RHEL, Fedora, CentOS) | Works across most Linux distributions |
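
To see which of the two mechanisms in the table apply to a running process, the kernel exposes both through `/proc`. The following is a minimal sketch, assuming a Linux system; the helper names `seccomp_mode_name` and `report_confinement` are made up for illustration. The `Seccomp` field in `/proc/<pid>/status` reports the seccomp mode, and `/proc/<pid>/attr/current` reports the AppArmor profile or SELinux context when one of those modules is active.

```shell
#!/bin/sh
# Hypothetical helper: translate the numeric Seccomp field into words.
seccomp_mode_name() {
    case "$1" in
        0) echo "disabled" ;;
        1) echo "strict" ;;
        2) echo "filter" ;;   # syscall filter list, the mode Docker profiles use
        *) echo "unknown" ;;
    esac
}

# Hypothetical helper: report seccomp mode and the MAC label of a process.
report_confinement() {
    pid="${1:-$$}"
    mode="$(awk '/^Seccomp:/ {print $2}' "/proc/$pid/status" 2>/dev/null)"
    echo "seccomp: $(seccomp_mode_name "$mode")"
    # The label file may be unreadable or empty when no MAC module is active.
    label="$(cat "/proc/$pid/attr/current" 2>/dev/null | tr -d '\0')"
    echo "lsm label: ${label:-none reported}"
}

report_confinement
```

Inside a container started with Docker's default settings, the first line would be expected to read `seccomp: filter`; the label depends on whether the host uses AppArmor or SELinux.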

`--cap-drop=ALL`

Baseline run with Docker's default capability set:

```shell
docker run -it --rm ubuntu bash
apt update
apt install -y iputils-ping
ping -c 4 www.google.com
```

The same steps with every capability dropped:

```shell
docker run -it --rm --cap-drop=ALL ubuntu bash
apt update
apt install -y iputils-ping
ping -c 4 www.google.com
```
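
To check what `--cap-drop=ALL` changed, the effective capability mask of a process can be read from `/proc`. The sketch below assumes a Linux system; `has_cap` is a hypothetical helper, and the bit numbers (13 for `CAP_NET_RAW`, 21 for `CAP_SYS_ADMIN`) come from the kernel's capability numbering. The sample mask `00000000a80425fb` is used only as an illustration of a mask with Docker's usual default capabilities set.

```shell
#!/bin/sh
# Hypothetical helper: succeed if capability bit $2 is set in hex mask $1.
has_cap() {
    mask="$1"
    bit="$2"
    [ $(( (0x$mask >> $bit) & 1 )) -eq 1 ]
}

# Read the effective capability mask (CapEff) of the current shell.
cap_eff="$(awk '/^CapEff:/ {print $2}' "/proc/$$/status" 2>/dev/null)"

if [ -n "$cap_eff" ]; then
    echo "CapEff: $cap_eff"
    if has_cap "$cap_eff" 13; then
        echo "CAP_NET_RAW: present"
    else
        echo "CAP_NET_RAW: absent"
    fi
else
    echo "CapEff not available on this system"
fi
```

Run inside each of the two containers above, the script should report a non-zero mask in the first and an all-zero mask in the second.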