[PATCH v2 2/4] capabilities: Add securebit to restrict userns caps

Mon Jun 10 09:46:06 UTC 2024

On Sun, Jun 09, 2024 at 09:33:01PM GMT, Serge E. Hallyn wrote:
> On Sun, Jun 09, 2024 at 03:43:35AM -0700, Jonathan Calmels wrote:
> > This patch adds a new capability security bit designed to constrain a
> > task’s userns capability set to its bounding set. The reason for this is
> > twofold:
> > 
> > - This serves as a quick and easy way to lock down a set of capabilities
> >   for a task, thus ensuring that any namespace it creates will never be
> >   more privileged than itself is.
> > - This helps userspace transition to more secure defaults by not requiring
> >   specific logic for the userns capability set, or libcap support.
> > 
> > Example:
> > 
> >     # capsh --secbits=$((1 << 8)) --drop=cap_sys_rawio -- \
> >             -c 'unshare -r grep Cap /proc/self/status'
> >     CapInh: 0000000000000000
> >     CapPrm: 000001fffffdffff
> >     CapEff: 000001fffffdffff
> >     CapBnd: 000001fffffdffff
> >     CapAmb: 0000000000000000
> >     CapUNs: 000001fffffdffff
> 
> But you are not (that I can see, in this or the previous patch)
> keeping SECURE_USERNS_STRICT_CAPS in securebits on the next
> level unshare.  Though I think it's ok, because by then both
> cap_userns and cap_bset are reduced and cap_userns can't be
> expanded.  (Sorry, just thinking aloud here)

Right this is safe to reset, but maybe we do keep it if the secbit is
locked? This is kind of a special case compared to the other bits.

> > +	/* Limit userns capabilities to our parent's bounding set. */
> 
> In the case of userns_install(), it will be the target user namespace
> creator's bounding set, right?  Not "our parent's"?

Good point, I should reword this comment.