[PATCH 00/34] fs: idmapped mounts

Fri Oct 30 16:03:32 UTC 2020

On Fri, Oct 30, 2020 at 10:07:48AM -0500, Seth Forshee wrote:
> On Thu, Oct 29, 2020 at 11:37:23AM -0500, Eric W. Biederman wrote:
> > First and foremost: A uid shift on write to a filesystem is a security
> > bug waiting to happen.  This is especially in the context of facilities
> > like iouring, that play very agressive games with how process context
> > makes it to  system calls.
> > 
> > The only reason containers were not immediately exploitable when iouring
> > was introduced is because the mechanisms are built so that even if
> > something escapes containment the security properties still apply.
> > Changes to the uid when writing to the filesystem does not have that
> > property.  The tiniest slip in containment will be a security issue.
> > 
> > This is not even the least bit theoretical.  I have seem reports of how
> > shitfs+overlayfs created a situation where anyone could read
> > /etc/shadow.
> 
> This bug was the result of a complex interaction with several
> contributing factors. It's fair to say that one component was overlayfs
> writing through an id-shifted mount, but the primary cause was related
> to how copy-up was done coupled with allowing unprivileged overlayfs
> mounts in a user ns. Checks that the mounter had access to the lower fs
> file were not done before copying data up, and so the file was copied up
> temporarily to the id shifted upperdir. Even though it was immediately
> removed, other factors made it possible for the user to get the file
> contents from the upperdir.
> 
> Regardless, I do think you raise a good point. We need to be wary of any
> place the kernel could open files through a shifted mount, especially
> when the open could be influenced by userspace.
> 
> Perhaps kernel file opens through shifted mounts should to be opt-in.
> I.e. unless a flag is passed, or a different open interface used, the
> open will fail if the dentry being opened is subject to id shifting.
> This way any kernel writes which would be subject to id shifting will
> only happen through code which as been written to take it into account.

For my use cases, it would be fine to require opt-in at original fs
mount time by init_user_ns admin.  I.e.
    mount -o allow_idmap /dev/mapper/whoozit /whatzit
I'm quite certain I would always be sharing a separate LV or loopback or
tmpfs.

-serge