[PATCH] selftests/landlock: Add tests for access through disconnected paths

Mon Jun 23 23:16:55 UTC 2025

On 6/23/25 20:40, Mickaël Salaün wrote:
> On Sun, Jun 22, 2025 at 04:42:49PM +0100, Tingmao Wang wrote:
>> On 6/19/25 12:38, Mickaël Salaün wrote:
>>> On Sat, Jun 14, 2025 at 07:25:02PM +0100, Tingmao Wang wrote:
>>>> [...]
>>>>
>>>> This might need more thinking, but maybe if one of the operands is
>>>> disconnected, we can just let it walk until IS_ROOT(dentry), and also
>>>> collect access for the other path until IS_ROOT(dentry), then call
>>>> is_access_to_paths_allowed() passing in the root dentry we walked to?  (In
>>>> this case is_access_to_paths_allowed will not do any walking and just make
>>>> an access decision.)
>>>
>>> If one side is in a disconnected directory and not the other side, the
>>> rename would be denied by the VFS,
>>
>> Not always, right? For example in the path_disconnected_rename test we did:
> 
> Correct, only the mount point matter.
> 
>>
>> 5051.  ASSERT_EQ(0, renameat(bind_s1d3_fd, file2_name, AT_FDCWD, file1_s2d2))
>>                              ^^^^^^^^^^^^^^^^^^^^^^^^  ^^^^^^^^^^^^^^^^^^^^
>>                              Disconnected              Connected
>>
>> (and it also has the other way)
>>
>> So looks like as long as they are still reached from two fds with two
>> paths that have the same mnt, it will be allowed.  It's just that when we
>> do parent walk we end up missing the mount.  This also means that for this
>> refer check, if after doing the two separate walks (with the disconnected
>> side walking all the way to IS_ROOT), we then walk from mnt again, we
>> would allow the rename if there is a rule on mnt (or its parents) allowing
>> file creation and refers, even if the disconnected side technically now
>> lives outside the file hierarchy under mnt and does not have a parent with
>> a rule allowing file creation.
>>
>> (I'm not saying this is necessary wrong or needs fixing, but I think it's
>> an interesting consequence of the current implementation.)
> 
> Hmm, that's indeed a very subtle side effect.  One issue with the
> current implementation is that if a directory between the mount
> point and the source has REFER, and another directory not part of the
> source hierarchy but part of the disconnected directory's hierarchy has
> REFER and no other directory has REFER, and either the source or the
> destination hierarchy is disconnected between the mount point and the
> directory with the REFER, then Landlock will still deny such
> rename/link.  A directory with REFER initially between the mount point
> and the disconnected directory would also be ignored.  There is also the
> case where both the source and the destination are disconnected.

Sorry, I'm having trouble following this.  Can you maybe give a more
specific example, perhaps with commands?

By "mount point" do you mean the bind mount? If a path has became
disconnected because the directory moved away from under the mountpoint,
and is therefore not covered by any REFER (you said "either the source or
the destination hierarchy is disconnected between the mount point and the
directory with the REFER") wouldn't it make sense for the rename to be
denied?

> 
> I didn't consider such cases with collect_domain_accesses().  I'm
> wondering if this path walk gap should be fixed (instead of applying
> https://lore.kernel.org/all/20250618134734.1673254-1-mic@digikod.net/ )
> or not.  We should not rely on optimization side effects, but I'm not
> sure which behavior would make more sense...  Any though?

I didn't quite understand your example above and how is it possible for us
to end up denying something that should be allowed.  My understanding of
the current implementation is, when either operands are disconnected, it
will walk all the way to the current filesystem's root and stop there.
However, it will then still do the walk from the original bind mount up to
the real root (/), and if there is any REFER rules on that path, we will
still allow the rename.  This means that if the rename still ends up being
denied, then it wouldn't have been allowed in the first place, even if the
path has not become disconnected.

An interesting concrete example I came up with:

/# uname -a
Linux 5610c72ba8a0 6.16.0-rc2-dev #43 SMP ...
/# mkdir /a /b
/# mkdir /a/a1 /b/b1
/# mount -t tmpfs none /a/a1
/# mkdir /a/a1/a11
/# mount --bind /a/a1/a11 /b/b1
/# mkdir /a/a1/a11/a111
/# tree /a /b
/a
`-- a1
    `-- a11
        `-- a111
/b
`-- b1
    `-- a111

7 directories, 0 files
/# cd /b/b1/a111/
/b/b1/a111# mv /a/a1/a11/a111 /a/a1/a12
/b/b1/a111# ls ..  # we're disconnected now
ls: cannot access '..': No such file or directory
/b/b1/a111 [2]# touch /a/a1/a12/file

/b/b1/a111# LL_FS_RO=/:/a/a1 LL_FS_RW=/:/b/b1  /sandboxer ls
Executing the sandboxed command...
file

/b/b1/a111# LL_FS_RO=/:/a/a1 LL_FS_RW=/:/b/b1  /sandboxer mv -v file file2
Executing the sandboxed command...
mv: cannot move 'file' to 'file2': Permission denied
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# This fails because for same dir rename we just use is_access_to_path_allowed,
# which will stop at /a/a1 (and thus never reach either /b/b1 or /).

/b/b1/a111 [1]# mkdir subdir
/b/b1/a111# LL_FS_RO=/:/a/a1 LL_FS_RW=/b/b1  /sandboxer mv -v file subdir/file2
Executing the sandboxed command...
[..] WARNING: CPU: 1 PID: 656 at security/landlock/fs.c:1065 ...
renamed 'file' -> 'subdir/file2'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# This works because now we restart walk from /b/b1 (the bind mnt)

/b/b1/a111# mv subdir/file2 file
/b/b1/a111# LL_FS_RO=/:/a/a1 LL_FS_RW=/a  /sandboxer mv -v file subdir/file2
Executing the sandboxed command...
mv: cannot move 'file' to 'subdir/file2': Permission denied
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# This is also not allowed, but that's OK since even though technically we're
# actually moving /a/a1/a12/file to /a/a1/a12/subdir/file2, we're not doing it
# through /a (we're walking into a12 via /b/b1, so rules on /a shouldn't
# apply anyway)

/b/b1/a111 [1]# LL_FS_RO=/:/a/a1 LL_FS_RW=/b  /sandboxer mv -v file subdir/file2
Executing the sandboxed command...
renamed 'file' -> 'subdir/file2'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# And this works because we walk from /b/b1 after doing collect_domain_accesses

I think overall this is just a very strange edge case and people should
not rely on the exact behavior whether it's intentional or optimization
side-effect (as long as it deny access / renames when there is no rules at
any of the reasonable upper directories).  Also, since as far as I can
tell this "optimization" only accidentally allows more access (i.e.  rules
anywhere between the bind mountpoint to real root would apply, even if
technically the now disconnected directory belongs outside of the
mountpoint), I think it might be fine to leave it as-is, rather than
potentially complicating this code to deal with this quite unusual edge
case?  (I mean, it's not exactly obvious to me whether it is more correct
to respect rules placed between the original bind mountpoint and root, or
more correct to ignore these rules (i.e. the behaviour of non-refer access
checks))

It is a bit weird that `mv -v file file2` and `mv -v file subdir/file2`
behaves differently tho.

If you would like to fix it, what do you think about my initial idea?:
> This might need more thinking, but maybe if one of the operands is
> disconnected, we can just let it walk until IS_ROOT(dentry), and also
> collect access for the other path until IS_ROOT(dentry), then call
> is_access_to_paths_allowed() passing in the root dentry we walked to?  (In
> this case is_access_to_paths_allowed will not do any walking and just make
> an access decision.)

This will basically make the refer checks behave the same as non-refer
checks on disconnected paths - walk until IS_ROOT, and stop there.