Apparmor move_mount mediation breaks mount tool in containers
Christian Brauner
brauner at kernel.org
Wed Dec 6 14:12:53 UTC 2023
On Tue, Dec 05, 2023 at 10:34:33AM -0800, John Johansen wrote:
> On 12/5/23 09:08, Christian Brauner wrote:
> > On Tue, Dec 05, 2023 at 09:45:35AM +0100, Linux regression tracking (Thorsten Leemhuis) wrote:
> > > [CCing Linus]
> > >
> > > Hi, top-posting for once, to make this easily accessible to everyone.
> > >
> > > Stéphane, many thx for your insights.
> > >
> > > I'm CCing Linus on this, as I guess he wants to be aware of this.
> > >
> > > Normally I try to sum up things somewhat when doing so, but this here
> > > likely won't end well. So all I'm saying is: commit 157a3537d6bc28
> > > ("apparmor: Fix regression in mount mediation") [v6.7-rc1] that was also
> > > included in v6.6.3 broke containers in some setups. John described that
> > > commit with the words "it is a far from good solution. It is a stop
> > > gap." and later mentioned that "a CVE is coming for this and that having
> > > this unmediated in the tree isn't good either."
> > >
> > > For more details please check out the thread that on lore started with
> > > this message:
> > >
> > > https://lore.kernel.org/all/CA+enf=u0UmgjKrd98EYkxFu7FYV8dR1SBYJn_1b0Naq=3twbbQ@mail.gmail.com/
> > >
> > > Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> > > --
> > > Everything you wanna know about Linux kernel regression tracking:
> > > https://linux-regtracking.leemhuis.info/about/#tldr
> > > If I did something stupid, please tell me, as explained on that page.
> > >
> > > On 05.12.23 07:57, Stéphane Graber wrote:
> > > > On Mon, Dec 4, 2023 at 9:20 AM Linux regression tracking (Thorsten
> > > > Leemhuis) <regressions at leemhuis.info> wrote:
> > > > >
> > > > > On 04.12.23 14:14, John Johansen wrote:
> > > > > > On 12/3/23 17:34, Stéphane Graber wrote:
> > > > >
> > > > > Side note: Stéphane, thx for CCing the regressions list.
> > > > >
> > > > > > > On Sun, Dec 3, 2023 at 8:21 PM John Johansen
> > > > > > > <john.johansen at canonical.com> wrote:
> > > > > > > > On 12/2/23 17:20, Stéphane Graber wrote:
> > > > > > > > >
> > > > > > > > > Upstream commit 157a3537d6bc28ceb9a11fc8cb67f2152d860146 which just
> > > > > > > > > landed in 6.6.3 stable as 96af45154a0be30485ad07f70f852b1456cb13d7 is
> > > > > > > > > blocking new mounts for all LXC, LXD and Incus users (at least) on
> > > > > > > > > distributions using the newer version of util-linux.
> > > > > > > > >
> > > > > > > > > That's because for a simple mount like "mount -t tmpfs tmpfs /tmp",
> > > > > > > > > the new mount command now performs:
> > > > > > > > > ```
> > > > > > > > > fsconfig(3, FSCONFIG_SET_STRING, "source", "tmpfs", 0) = 0
> > > > > > > > > fsconfig(3, FSCONFIG_CMD_CREATE, NULL, NULL, 0) = 0
> > > > > > > > > fsmount(3, FSMOUNT_CLOEXEC, 0) = 4
> > > > > > > > > statx(4, "", AT_STATX_SYNC_AS_STAT|AT_EMPTY_PATH, STATX_MNT_ID,
> > > > > > > > > {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID,
> > > > > > > > > stx_attributes=STATX_ATTR_MOUNT_ROOT, stx_mode=S_IFDIR|S_ISVTX|0777,
> > > > > > > > > stx_size=40, ...}) = 0
> > > > > > > > > move_mount(4, "", AT_FDCWD, "/tmp", MOVE_MOUNT_F_EMPTY_PATH) = 0
> > > > > > > > > ```
> > > > > > > > >
> > > > > > > > > That last call to "move_mount" is incorrectly interpreted by AppArmor
> > > > > > > > > as an attempt to move-mount "/" to "/mnt" rather than as a new mount
> > > > > > > > > being created, this therefore results in:
> > > > > > > > > ```
> > > > > > > > > Dec 03 01:05:03 kernel-test kernel: audit: type=1400
> > > > > > > > > audit(1701565503.599:34): apparmor="DENIED" operation="mount"
> > > > > > > > > class="mount" info="failed perms check" error=-13
> > > > > > > > > profile="incus-a_</var/lib/incus>" name="/tmp/" pid=2190 comm="mount"
> > > > > > > > > srcname="/" flags="rw, move"
> > > > > > > > > ```
> >
> > move_mount() doesn't create a new mount it just moves an already
> > existing mount around in the tree. Either that mount is moved
> > (fsmount(), open_tree(OPEN_TREE_CLONE)) from it's source location or it
> > was a detached mount to begin with.
> >
> yes, all to aware of this
>
> > The new mount api splits mounting across multiple system calls. And the
> > state it keeps for itself is gone once you create a mount via fsmount().
>
> yep
>
> > Because then the fs_context will be destroyed and you're only dealing
> > with a superblock and mount(s) for it.
> >
> yep
>
> > You shouldn't care about move_mount() moving a detached mount because
> > really it is not a move from some other filesystem location. The
> > creation of that mount will have already happened in
> > open_tree(OPEN_TREE_CLONE) or in fsmount(). So that ship has sailed and
> > move_mount() is the wrong place to do this.
> >
> yes and no. What we are specifically dong is mediating existing attached
> mounts as always and blocking move_mount from attaching the detached
> mount via move_mount. Which is very much something move_mount mediation
> should be able to do.
Yes, but my point is that it's often at the wrong point in time. You
can mediate this earlier as well - see my open_tree() comment below.
>
> It works as a stop gap for the new mount api bypassing the LSM at a very
> course level. This is why it requires a very general mount rule for
> apparmor to allow move_mount of the detached mount. A conditional that
> improves control around this is coming, but we can't make move_mount()
> mediation provide the full set of apparmor old mount mediation.
>
> Providing something close to what we were doing before is going to
> require new hooks, and likely specialized states tailored to the
> new mount api. Until that does happen it does mean a reduction in
> what apparmor policy can mediate. Today instead of being able to
> specify details about the mount you need a generic mount rule that
> just allows the application to do pretty much anything with mounts,
> soon you will be able to have an addition conditional that allows
> better control of move mount, and better logging of what is going on.
If you want to start doing full mediation for the new mount api you're
likely missing quite a few pieces.
I expect you need a hook in at least fsmount() as this always creates a
detached mount for a new filesystem context (You touched on that in an
earlier mail.).
But you also need a new hook in open_tree() but _only_ in the
OPEN_TREE_CLONE path because that creates detached mounts as well.
And you will also very likely want a new hook in mount_setattr() because
that allows you to change mount properties as well and the creation of
idmapped mounts and so on.
More information about the Linux-security-module-archive
mailing list