[PATCH v2 0/7] fs/9p: Reuse inode based on path (in addition to qid)
Tingmao Wang
m at maowtm.org
Thu Sep 4 00:04:10 UTC 2025
Hi!
This is the second version of this series. The individual commits
contains changelogs (most of them are in the first patch), but overall,
most significantly cached mode (loose or metadata) is now unchanged, there
is no longer a "don't reuse inodes at all" mode, bug fixes, using the
right functions, basic rename handling, and new documentation.
Thanks in advance for the review effort :)
v1: https://lore.kernel.org/all/cover.1743971855.git.m@maowtm.org/
Background
----------
(This section has basically the same content as the v1 cover letter)
Previously [1], I noticed that when using 9pfs filesystems, the Landlock
LSM is blocking access even for files / directories allowed by rules, and
that this has something to do with 9pfs creating new inodes despite
Landlock holding a reference to the existing one. Because Landlock uses
inodes' in-memory state (i_security) to identify allowed fs
objects/hierarchies, this causes Landlock to partially break on 9pfs, at
least in uncached mode, which is the default:
# mount -t 9p -o trans=virtio test /mnt
# env LL_FS_RO=/etc:/usr:/bin:/lib:/mnt/readme LL_FS_RW= /sandboxer bash
Executing the sandboxed command...
# cat /mnt/readme
cat: /mnt/readme: Permission denied
This, however, works if somebody is holding onto the dentry (and it also
works with cache=loose), as in both cases the inode is reused:
# tail -f /mnt/readme &
[1] 196
# env LL_FS_RO=/etc:/usr:/bin:/lib:/mnt/readme LL_FS_RW= /sandboxer bash
Executing the sandboxed command...
# cat /mnt/readme
aa
It also works on directories if one have a shell that cd into the
directory. Note that this means only certain usage of Landlock are
affected - for example, sandboxing applications that takes a list of files
to allow, landlocks itself, then evecve. On the other hand, this does not
affect applications that opens a file, then Landlocks itself while keeping
the file it needs open.
While the above is a very simple example, this is problematic in
real-world use cases if Landlock is used to sandox applications on system
that has files mounted via 9pfs, or use 9pfs as the root filesystem. In
addition, this also affects fanotify / inotify when using inode mark (for
local access):
root at d8c28a676d72:/# ./fanotify-basic-open /readme & # on virtiofs
[1] 173
root at d8c28a676d72:/# cat readme
aa
FAN_OPEN: File /readme
root at d8c28a676d72:/# mount -t 9p -o trans=virtio test /mnt
root at d8c28a676d72:/# ./fanotify-basic-open /mnt/readme & # on 9pfs
[2] 176
root at d8c28a676d72:/# cat /mnt/readme
aa
root at d8c28a676d72:/#
Same can be demonstrated with inotifywait. The source code for
fanotify-basic-open, adopted from the fanotify man page, is available at
https://fileshare.maowtm.org/9pfs-landlock-fix/20250903/fanotify-basic-open.c [2].
Note that this is not a security bug for Landlock since it can only cause
legitimate access to be denied, but might be a problem for fanotify perm
(although I do recognize that using perm on individual inodes is already
perhaps a bit unreliable?)
It seems that there was an attempt at making 9pfs reuse inodes on uncached
mode as well, based on qid.path, however it was reverted [3] due to issues
with servers that present duplicate qids, for example on a QEMU host that
has multiple filesystems mounted under a single 9pfs export without
multidevs=remap, or in the case of other servers that doesn't necessarily
support remapping qids ([4] and more). I've done some testing on
v6.12-rc4 which has the simplified 9pfs inode code before it was reverted,
and found that Landlock works (however, we of course then have the issue
demonstrated in [3]).
What this series do
-------------------
(Changes since v1: added more reasoning for the ino_path struct)
With the above in mind, I have a proposal for 9pfs to:
1. Reuse inodes even in uncached mode
2. However, reuse them based on qid.path AND the actual pathname, by doing
the appropriate testing in v9fs_test_inode(_dotl)?
The main problem here is how to store the pathname in a sensible way and
tie it to the inode. For now I opted with an array of names acquired with
take_dentry_name_snapshot, which reuses the same memory as the dcache to
store the actual strings, but doesn't tie the lifetime of the dentry with
the inode (I thought about holding a reference to the dentry in the
v9fs_inode, but it seemed like a wrong approach and would cause dentries
to not be evicted/released).
Additional discussions
----------------------
(New section)
>From some QEMU documentation I read [5] it seems like there is a plan to
resolve these kind of problems in a new version of the protocol, by
expanding the qid to include the filesystem identifier of a file on the
host, so maybe this can be disabled after a successful protocol version
check with the host? For now, inodeident=path will be the default for
uncached filesystems, which can be set to 'qid' to instead to reuse based
only on server-provided inode numbers.
This patchset currently uses strncmp to compare paths but this might be
able to be optimized into a hash comparison first (not done yet).
Alternatively the path can be stored more compactly in the form of a
single string with `/` in it (like normal paths). However, we should
normally only need to do this comparison for one pair of filenames, as the
test is only done if qid.path matches in the first place.
This patchset currently does not support enabling path-based inodes in
cached mode. Additional care needs to be taken to ensure we can refresh
an inode that potentially has data cached, but since Dominique is happy
with cached mode behaving as-is (reusing inodes via qid only), this is not
done.
The current implementation will handle client-side renames of a single
file (or empty directory) correctly, but server side renames, or renaming
a non-empty directory (client or server side), will cause the files being
renamed (or files under the renamed directory) to use new inodes (unless
they are renamed back). The decision to not update the children of a
client-renamed directory is purely to reduce the complexity of this patch,
but is in principle possible.
Testing and explanations
------------------------
(New section)
# mount -t 9p -o ... test /mnt
with the following options:
- trans=virtio
- trans=virtio,inodeident=qid
- trans=virtio,cache=loose
# env LL_FS_RO=/etc:/usr:/bin:/lib:/mnt/readme LL_FS_RW= /sandboxer bash
Executing the sandboxed command...
# cat /mnt/readme
hi
^^ landlock works
# mount -t 9p -o trans=virtio test /mnt
# mkdir /mnt/dir
# mv /mnt/readme /mnt/dir/readme
# env LL_FS_RO=/etc:/usr:/bin:/lib:/mnt/dir/readme LL_FS_RW= /sandboxer bash
Executing the sandboxed command...
# cat /mnt/dir/readme
hi
^^ landlock works
# # another terminal in guest: mv /mnt/dir/readme /mnt/dir/readme.2
# cat /mnt/dir/readme.2
hi
^^ ino_path is carried with renames
# # host: mv 9pfs/dir/readme.2 9pfs/dir/readme
# cat /mnt/dir/readme.2
cat: /mnt/dir/readme.2: No such file or directory
# cat /mnt/dir/readme
cat: /mnt/dir/readme: Permission denied
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ we can't track renames on the server side
# # host: mv 9pfs/dir/readme 9pfs/dir/readme.2
# cat /mnt/dir/readme.2
hi
^^ once the file is back at its original place it works as expected.
# # another terminal in guest: mv /mnt/dir/readme.2 /mnt/dir/readme
# cat /mnt/dir/readme
hi
^^ we can track renames of the file directly...
# # another terminal in guest: mv /mnt/dir /mnt/dir.2
# cat /mnt/dir.2/readme
cat: /mnt/dir.2/readme: Permission denied
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ but not renames of the parent directory, even if done client-side
# # another terminal in guest: mv /mnt/dir.2 /mnt/dir
# cat /mnt/dir/readme
hi
^^ works once it's back
# # another terminal in guest: mv /mnt/dir /mnt/dir.2 && mkdir /mnt/dir && echo hi2 > /mnt/dir/readme
# cat /mnt/dir/readme
cat: /mnt/dir/readme: Permission denied
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ a different file uses a different inode even if same path
# # another terminal in guest: mv /mnt/dir.2/readme /mnt/dir/readme
# cat /mnt/dir/readme
hi
# # host: rm 9pfs/dir/readme && echo hi3 > 9pfs/dir/readme
# cat /mnt/dir/readme
cat: /mnt/dir/readme: Permission denied
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ a different file (identified by server-side qid changes) uses different inode
fanotify also works, as tested with the program attached at the end.
In addition, I ran xfstests on a uncached 9pfs mount, and while there are
some test failures, it is the same set of failures as on the current
mainline. Test logs at https://fileshare.maowtm.org/9pfs-landlock-fix/20250903/index.html
Tested also with Mickaël's new v9fs landlock tests [6] (unmerged yet):
# RUN layout3_fs.v9fs.tag_inode_dir_parent ...
# OK layout3_fs.v9fs.tag_inode_dir_parent
ok 129 layout3_fs.v9fs.tag_inode_dir_parent
# RUN layout3_fs.v9fs.tag_inode_dir_mnt ...
# OK layout3_fs.v9fs.tag_inode_dir_mnt
ok 130 layout3_fs.v9fs.tag_inode_dir_mnt
# RUN layout3_fs.v9fs.tag_inode_dir_child ...
# OK layout3_fs.v9fs.tag_inode_dir_child
ok 131 layout3_fs.v9fs.tag_inode_dir_child
# RUN layout3_fs.v9fs.tag_inode_file ...
# OK layout3_fs.v9fs.tag_inode_file
ok 132 layout3_fs.v9fs.tag_inode_file
# RUN layout3_fs.v9fs.release_inodes ...
# OK layout3_fs.v9fs.release_inodes
ok 133 layout3_fs.v9fs.release_inodes
This patch series was based on, and mostly tested on v6.17-rc1 + [7]
Kind regards,
Tingmao
[1]: https://github.com/landlock-lsm/linux/issues/45
[2]: https://fileshare.maowtm.org/9pfs-landlock-fix/20250903/fanotify-basic-open.c
[3]: https://lore.kernel.org/all/20241024-revert_iget-v1-4-4cac63d25f72@codewreck.org/
[4]: https://lore.kernel.org/all/20240923100508.GA32066@willie-the-truck/
[5]: https://wiki.qemu.org/Documentation/9p#Protocol_Plans
[6]: https://lore.kernel.org/all/20250704171345.1393451-1-mic@digikod.net/
[7]: https://lore.kernel.org/all/cover.1743956147.git.m@maowtm.org/
Tingmao Wang (7):
fs/9p: Add ability to identify inode by path for .L in uncached mode
fs/9p: add option for path-based inodes
fs/9p: Add ability to identify inode by path for non-.L in uncached
mode
fs/9p: .L: Refresh stale inodes on reuse
fs/9p: non-.L: Refresh stale inodes on reuse
fs/9p: update the target's ino_path on rename
docs: fs/9p: Document the "inodeident" option
Documentation/filesystems/9p.rst | 42 +++++++
fs/9p/Makefile | 3 +-
fs/9p/ino_path.c | 111 ++++++++++++++++++
fs/9p/v9fs.c | 59 +++++++++-
fs/9p/v9fs.h | 87 ++++++++++----
fs/9p/vfs_inode.c | 195 ++++++++++++++++++++++++++-----
fs/9p/vfs_inode_dotl.c | 171 +++++++++++++++++++++++----
fs/9p/vfs_super.c | 13 ++-
8 files changed, 611 insertions(+), 70 deletions(-)
create mode 100644 fs/9p/ino_path.c
base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
prerequisite-patch-id: 3dae487a4b3d676de7c20b269553e3e2176b1e36
prerequisite-patch-id: 93ab54c52a41fa44b8d0baf55df949d0ad27e99a
prerequisite-patch-id: 5f558bf969e6eaa3d011c98de0806ca8ad369efe
--
2.51.0
More information about the Linux-security-module-archive
mailing list