[RFC PATCH v1 00/11] Landlock: Namespace and capability control

Christian Brauner brauner at kernel.org
Wed Mar 25 12:34:31 UTC 2026


On Thu, Mar 12, 2026 at 11:04:33AM +0100, Mickaël Salaün wrote:
> Namespaces are a fundamental building block for containers and
> application sandboxes, but user namespace creation significantly widens
> the kernel attack surface.  CVE-2022-0185 (filesystem mount parsing),
> CVE-2022-25636 and CVE-2023-32233 (netfilter), and CVE-2022-0492 (cgroup
> v1 release_agent) all demonstrate vulnerabilities exploitable only
> through capabilities gained via user namespaces.  Some distributions
> block user namespace creation entirely, but this removes a useful
> isolation primitive.  Fine-grained control allows trusted programs to
> use namespaces while preventing unnecessary exposure for programs that
> do not need them.
> 
> Existing mechanisms (user.max_*_namespaces sysctls, userns_create LSM
> hook, PR_SET_NO_NEW_PRIVS, and capset) each address part of this threat
> but none provides per-process, fine-grained control over both namespace
> types and capabilities.  Container runtimes resort to seccomp-based
> clone/unshare filtering, but seccomp cannot dereference clone3's flag
> structure, forcing runtimes to block clone3 entirely.
> 
> Landlock's composable layer model enables several patterns: a user
> session manager can restrict namespace types and capabilities broadly
> while allowing trusted programs to create the namespaces they need, and
> each deeper layer can further restrict the allowed set.  Container
> runtimes can similarly deny namespace creation inside managed
> containers.
> 
> This series adds two new permission categories to Landlock:
> 
> - LANDLOCK_PERM_NAMESPACE_ENTER: Restricts which namespace types a
>   sandboxed process can acquire: both creation (unshare/clone) and entry
>   (setns).  User namespace creation has no capability check in the
>   kernel, so this is the only enforcement mechanism for that entry
>   point.
> 
> - LANDLOCK_PERM_CAPABILITY_USE: Restricts which Linux capabilities a
>   sandboxed process can use, regardless of how they were obtained
>   (including through user namespace creation).
> 
> Both use new handled_perm and LANDLOCK_RULE_* constants following the
> existing allow-list model.  The UAPI uses raw CAP_* and CLONE_NEW*
> values directly; unknown values are silently accepted for forward
> compatibility (the allow-list denies them by default).  The Landlock ABI
> version is bumped from 8 to 9.
> 
> The handled_perm infrastructure is designed to be reusable by future
> permission categories.  The last patch documents the design rationale
> for the permission model and the criteria for choosing between
> handled_access_*, handled_perm, and scoped.  A patch series to add
> socket creation control is under review [2]; it could benefit from the
> same permission model to achieve complete deny-by-default coverage of
> socket creation.
> 
> This series builds on Christian Brauner's namespace LSM blob RFC [1],
> included as patch 1.
> 
> Christian, could you please review patch 3?  It adds a FOR_EACH_NS_TYPE
> X-macro to ns_common_types.h and derives CLONE_NS_ALL, replacing inline
> CLONE_NEW* flag enumerations in nsproxy.c and fork.c.

This all looks good to me, thanks! I'd really love to see this go in.
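
For readers unfamiliar with the X-macro pattern that the request about patch 3 refers to, here is a minimal userspace sketch of the idea. It is not the code from the series. DEMO_FOR_EACH_NS_TYPE, DEMO_CLONE_NS_ALL, and the helper names are hypothetical, and the (flag, name) entry format is an assumption made for illustration. Only the CLONE_NEW* values come from the existing UAPI. The point is that a single list of namespace types can generate both the combined flag mask and a lookup table, so the two cannot drift apart.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>

/* Fallbacks for older libc headers; values match the kernel UAPI. */
#ifndef CLONE_NEWTIME
#define CLONE_NEWTIME 0x00000080
#endif
#ifndef CLONE_NEWCGROUP
#define CLONE_NEWCGROUP 0x02000000
#endif

/*
 * Hypothetical X-macro: the single list of namespace types.
 * Each entry is expanded by whatever macro is passed in as X.
 */
#define DEMO_FOR_EACH_NS_TYPE(X)   \
	X(CLONE_NEWNS,     "mnt")    \
	X(CLONE_NEWCGROUP, "cgroup") \
	X(CLONE_NEWUTS,    "uts")    \
	X(CLONE_NEWIPC,    "ipc")    \
	X(CLONE_NEWUSER,   "user")   \
	X(CLONE_NEWPID,    "pid")    \
	X(CLONE_NEWNET,    "net")    \
	X(CLONE_NEWTIME,   "time")

/* Expansion 1: OR every flag together to derive the "all namespaces" mask. */
#define DEMO_OR_FLAG(flag, name) | (unsigned long)(flag)
#define DEMO_CLONE_NS_ALL (0UL DEMO_FOR_EACH_NS_TYPE(DEMO_OR_FLAG))

/* Expansion 2: build a flag-to-name table from the same list. */
struct demo_ns_type {
	unsigned long flag;
	const char *name;
};

#define DEMO_TABLE_ENTRY(flag, name) { (unsigned long)(flag), (name) },
static const struct demo_ns_type demo_ns_types[] = {
	DEMO_FOR_EACH_NS_TYPE(DEMO_TABLE_ENTRY)
};

/* Number of namespace types in the list. */
static size_t demo_count_ns_types(void)
{
	return sizeof(demo_ns_types) / sizeof(demo_ns_types[0]);
}

/* Keep only the namespace-related bits of a clone/unshare flag word. */
static unsigned long demo_ns_flags(unsigned long flags)
{
	return flags & DEMO_CLONE_NS_ALL;
}
```

With this layout, adding a namespace type means adding one line to the list, and both the mask and the table pick it up. For example, demo_ns_flags(CLONE_NEWUSER | CLONE_VM) keeps only the CLONE_NEWUSER bit.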


