[PATCH v2] xattr: Enable security.capability in user namespaces

Serge E. Hallyn serge at hallyn.com
Thu Jul 13 19:48:42 UTC 2017


Quoting Eric W. Biederman (ebiederm at xmission.com):
> Stefan Berger <stefanb at linux.vnet.ibm.com> writes:
> 
> > On 07/13/2017 01:14 PM, Eric W. Biederman wrote:
> >> Theodore Ts'o <tytso at mit.edu> writes:
> >>
> >>> On Thu, Jul 13, 2017 at 07:11:36AM -0500, Eric W. Biederman wrote:
> >>>> The concise summary:
> >>>>
> >>>> Today we have the xattr security.capability that holds a set of
> >>>> capabilities that an application gains when executed.  AKA setuid root exec
> >>>> without actually being setuid root.
> >>>>
> >>>> User namespaces have the concept of capabilities that are not global but
> >>>> are limited to their user namespace.  We do not currently have
> >>>> filesystem support for this concept.
> >>> So correct me if I am wrong; in general, there will only be one
> >>> variant of the form:
> >>>
> >>>     security.foo@uid=15000
> >>>
> >>> It's not like there will be:
> >>>
> >>>     security.foo@uid=1000
> >>>     security.foo@uid=2000
> >>>
> >>> Except.... if you have a distribution root directory which is shared
> >>> by many containers, you would need to put the xattrs in the overlay
> >>> inodes.  Worse, each time you launch a new container, with a new
> >>> subuid allocation, you will have to iterate over all files with
> >>> capabilities and do a copy-up operation on the xattrs in overlayfs.
> >>> So that's actually a bit of a disaster.
> >>>
> >>> So for distribution overlays, you will need to do things a different
> >>> way, which is to map the distro subdirectory so you know that the
> >>> capability with the global uid 0 should be used for the container
> >>> "root" uid, right?
> >>>
> >>> So this hack of using security.foo@uid=1000 is *only* useful when the
> >>> subcontainer root wants to create the privileged executable.  You
> >>> still have to do things the other way.
> >>>
> >>> So can we make perhaps the assertion that *either*:
> >>>
> >>>     security.foo
> >>>
> >>> exists, *or*
> >>>
> >>>     security.foo@uid=BAR
> >>>
> >>> exists, but never both?  And there BAR is exclusive to only one
> >>> instance?
> >>>
> >>> Otherwise, I suspect that the architecture is going to turn around and
> >>> bite us in the *ss eventually, because someone will want to do
> >>> something crazy and the solution will not be scalable.
> >> Yep.  That is what it looks like from here.
> >>
> >> Which is why I asked the question about scalability of the xattr
> >> implementations.  It looks like trying to accommodate the general
> >> case just gets us in trouble, and sets unrealistic expectations.
> >>
> >> Which strongly suggests that Serge's previous version that
> >> just revved the format of security.capability so that a uid field could
> >> be added is likely to be the better approach.
> >>
> >> I want to see what Serge and Stefan have to say but the case looks
> >> pretty clear cut at the moment.

I'm fine with that.  Now, we'll be doing the enforcement at xattr
write time, meaning someone *can* come up with an fs image with >1
such xattrs.  Which is *fine*, I believe: it won't break anything
security-wise, and our goal is only to stop users from thinking it
is legitimate to write multiple such xattrs, so that they don't later
bug the fs folks like Ted saying "hey why can't I write 1000 of these,
I think that's a bug."

So at xattr write time,

	1. if there is already an xattr, and it is either the global
	non-namespaced xattr, or it has kuid=X where X is the kuid
	mapped to root in a parent of the container, then we refuse
	the write
	2. if there is already an xattr, and it is for a kuid=X where
	X is mapped into the container, then we overwrite the existing
	xattr.
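
To make the two write-time rules concrete, here is a rough model of the
decision in Python.  This is only a sketch, not kernel code: the names
Container, ExistingXattr, Decision and decide() are made up for
illustration, and a container is reduced to one contiguous kuid range
plus the set of kuids that are root in its parent namespaces.  Two cases
are not covered by the rules above and are marked as assumptions: with
no existing xattr the write is taken to proceed, and an existing xattr
whose kuid matches neither rule is reported as unspecified.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class Decision(Enum):
    WRITE = "write"              # assumption: no xattr yet, so the write proceeds
    REFUSE = "refuse"            # rule 1
    OVERWRITE = "overwrite"      # rule 2
    UNSPECIFIED = "unspecified"  # not covered by the two rules


@dataclass(frozen=True)
class ExistingXattr:
    # kuid recorded in the xattr; None models the global non-namespaced xattr
    root_kuid: Optional[int]


@dataclass(frozen=True)
class Container:
    # kuids that are mapped to root in a parent namespace of this container
    parent_root_kuids: FrozenSet[int]
    # kuids mapped into this container: [mapped_first, mapped_first + mapped_count)
    mapped_first: int
    mapped_count: int

    def maps(self, kuid: int) -> bool:
        return self.mapped_first <= kuid < self.mapped_first + self.mapped_count


def decide(container: Container, existing: Optional[ExistingXattr]) -> Decision:
    """Decide what a capability xattr write from `container` should do."""
    if existing is None:
        return Decision.WRITE
    # Rule 1 is checked first, in the order the rules are listed.
    if existing.root_kuid is None or existing.root_kuid in container.parent_root_kuids:
        return Decision.REFUSE
    # Rule 2: the existing xattr belongs to a kuid mapped into the container.
    if container.maps(existing.root_kuid):
        return Decision.OVERWRITE
    return Decision.UNSPECIFIED
```

For example, with a container mapped at kuids 200000-265535 whose
parents have root at kuid 0 and kuid 100000, decide() refuses when the
existing xattr is the global one or belongs to kuid 100000, and
overwrites when it belongs to kuid 200000.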

At read/use time, we use the rules we have now.

Does that seem reasonable?

-serge