[PATCH 0/3] Introduce user namespace capabilities

Jonathan Calmels jcalmels at 3xx0.net
Fri May 17 11:42:02 UTC 2024


> > > On Thu May 16, 2024 at 10:07 PM EEST, Casey Schaufler wrote:
> > > > I suggest that adding a capability set for user namespaces is a bad idea:
> > > > 	- It is in no way obvious what problem it solves
> > > > 	- It is not obvious how it solves any problem
> > > > 	- The capability mechanism has not been popular, and relying on a
> > > > 	  community (e.g. container developers) to embrace it based on this
> > > > 	  enhancement is a recipe for failure
> > > > 	- Capabilities are already more complicated than modern developers
> > > > 	  want to deal with. Adding another, special purpose set, is going
> > > > 	  to make them even more difficult to use.

Sorry if the commit wasn't clear enough. Basically:

- Today, user namespaces grant full capabilities.
  This behavior is often abused to attack various kernel subsystems.
  The only option is to disable user namespaces altogether, which breaks
  a lot of userspace.
  Granting full capabilities by default goes against the principle of
  least privilege.

- The series adds a new capability set.
  This set dictates which capabilities are granted in user namespaces
  (instead of always getting full caps).
  This brings namespaces in line with the rest of the system: user
  namespaces are no longer "special".
  They now work the same way as, say, a transition to root does with
  inheritable caps.

- This isn't intended to be used by end users per se (although they could).
  It would be used in the same places where existing capabilities are
  used today (e.g. init system, PAM, container runtime, browser
  sandbox), or by system administrators.

To give you some ideas of things you could do:

# E.g. prevent alice from getting CAP_NET_ADMIN in user namespaces under SSH
echo "auth optional pam_cap.so" >> /etc/pam.d/sshd
echo "!cap_net_admin alice" >> /etc/security/capability.conf

# E.g. prevent any Docker container from ever getting CAP_DAC_OVERRIDE
systemd-run -p CapabilityBoundingSet=~CAP_DAC_OVERRIDE \
            -p SecureBits=userns-strict-caps \
            /usr/bin/dockerd

# E.g. kernel could be vulnerable to CAP_SYS_RAWIO exploits
# Prevent users from ever gaining it
sysctl -w cap_bound_userns_mask=0x1fffffdffff
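
For reference, here is a minimal sketch of where a mask value like the
one above comes from. It assumes CAP_SYS_RAWIO is capability number 17
and that the highest capability is number 40 (CAP_CHECKPOINT_RESTORE),
as in current linux/capability.h; both numbers are assumptions to check
against your kernel headers. The variable names are only illustrative.

```shell
# Assumed values, taken from linux/capability.h on a recent kernel.
cap_last_cap=40      # highest capability number (CAP_CHECKPOINT_RESTORE)
cap_sys_rawio=17     # CAP_SYS_RAWIO

# All capabilities 0..cap_last_cap set.
full_mask=$(( (1 << (cap_last_cap + 1)) - 1 ))

# Clear the CAP_SYS_RAWIO bit.
mask=$(( full_mask & ~(1 << cap_sys_rawio) ))

mask_hex=$(printf '0x%x' "$mask")
echo "$mask_hex"
```

The same arithmetic works for any other capability by swapping in its
number.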
