[PATCH v2 0/7] fs/9p: Reuse inode based on path (in addition to qid)
Tingmao Wang
m at maowtm.org
Sun Sep 14 21:25:02 UTC 2025
Hi Dominique and others,
I had a chat with Mickaël earlier this week and some discussion following
that, and we thought of a potential alternative to what I was proposing
here that might work for Landlock: using the inode number (or more
correctly, qid.path) directly as the keys for Landlock rules when
accessing 9p files. I'm not sure how sound this is from the perspective
of 9pfs (there are pros and caveats), and I would like to gather some
thoughts on this idea.
Technically a 9pfs server is not supposed to return colliding qid.paths
for different files. In fact, according to [1], the qid must not be the
same even for files which are deleted then recreated using the same name
(whereas for e.g. ext4, inode number is reused if a file is deleted and
recreated, possibly with a different name, in the same directory).
However, this is in practice not the case for many actual 9pfs server
implementations (thus the reason for this patch series in the first
place).
This is a bad problem for the 9pfs client in Linux as it can lead to data
corruption if the wrong inode is used, but for Landlock, the only effect
of this is allowing access to more files than the sandboxing application
intended (and only in the presence of an "erroneous" 9pfs server). Any
other alternative, including this patch series, has the opposite risk -
files that should be allowed might be denied (even if the server
implementation is fully correct in terms of not reusing qids). In
particular, this patch cannot correctly handle server-side renames of an
allowed file, or rename of a directory with children in it from the client
(although this might be solved, at the expense of adding more
complicated code in the rename path to rewrite all the struct ino_paths).
In discussion with Mickaël he thought that it would be acceptable for
Landlock to assume that the server is well-behaved, and Landlock could
specialize for 9pfs to allow access if the qid matches what's previously
seen when creating the Landlock ruleset (by using the qid as the key of
the rule, instead of a pointer to the inode).
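
To make the idea concrete, here is a minimal userspace sketch of what a
qid-keyed rule lookup might look like. This is not Landlock code: the
names (rule_key, rule_key_equal, KEY_9P_QID, the sb field) are
hypothetical and only illustrate keying on qid.path, scoped to one
filesystem instance, instead of on the inode pointer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical key types: pointer identity vs. server-assigned qid.path. */
enum rule_key_type { KEY_INODE_PTR, KEY_9P_QID };

struct rule_key {
	enum rule_key_type type;
	const void *sb;			/* assumed: identifies the mounted filesystem */
	union {
		const void *inode;	/* other filesystems: in-memory inode identity */
		uint64_t qid_path;	/* 9pfs: qid.path as sent by the server */
	} u;
};

/* Two keys match only if they are of the same kind, on the same
 * filesystem, and refer to the same object. */
static bool rule_key_equal(const struct rule_key *a, const struct rule_key *b)
{
	if (a->type != b->type || a->sb != b->sb)
		return false;
	if (a->type == KEY_9P_QID)
		return a->u.qid_path == b->u.qid_path;
	return a->u.inode == b->u.inode;
}
```

With a key like this, a 9pfs inode that is dropped and re-instantiated
would still match the rule as long as the server reports the same
qid.path, which is the behaviour we want in uncached mode.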
There are, however, several immediate issues with this approach:
1. The qid is 9pfs internal data, and we may need extra API for 9pfs to
expose this to Landlock. On 64bit, this is easy as it's just the inode
number (offset by 2), which we can already get from the struct inode.
But perhaps on 32bit we need a way to expose the full 64bit server-sent
qid to Landlock (or other kernel subsystems), if we're going to do
this.
2. Even though qids are supposed to be unique across the lifetime of a
filesystem (including deleted files), this is not the case even for
QEMU in multidevs=remap mode, when running on ext4, as tested on QEMU
10.1.0. And thus in practice a Landlock ruleset would need to hold a
reference to the file to keep it open, so that the server will not
re-use the qid for other files (having a reference to the struct inode
alone doesn't seem to do that).
Unfortunately, holding a dentry in Landlock prevents the filesystem
from being unmounted (causes WARNs), with no (proper) chance for
Landlock to release those dentries. We might do it in
security_sb_umount, but then at that point it is not guaranteed that
the unmount will happen - perhaps we would need a new security_ hook
in the umount path?
Alternatively, I think if we could somehow tell 9pfs to keep a fid open
(until either the Landlock domain is closed, or the filesystem is
unmounted), it could also work.
I'm not sure what the best way to do this is. It seems like unless we
can get a new pre_umount / pre_sb_delete hook in which we can free
dentries, 9pfs would need to expose some new API, or alternatively, in
uncached mode, have the v9fs inode itself hold a (strong) reference to
the fid, so that if Landlock has a reference to the inode, the file is
kept open server-side.
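
To illustrate issue 1, here is a small userspace sketch of why the inode
number is enough on 64bit but not on 32bit. The "+ 2" offset is as
described above; the 32bit folding (XOR of the upper and lower halves)
is my assumption of how the client narrows the value, and the function
names are made up for this example.

```c
#include <stdint.h>

/* 64bit ino_t: the inode number is qid.path + 2, so it is reversible. */
static uint64_t qid_to_ino64(uint64_t qid_path)
{
	return qid_path + 2;
}

static uint64_t ino64_to_qid(uint64_t ino)
{
	return ino - 2;
}

/* 32bit ino_t (assumed folding): the two halves are XORed together,
 * so distinct qid.paths can map to the same inode number. */
static uint32_t qid_to_ino32(uint64_t qid_path)
{
	uint64_t path = qid_path + 2;

	return (uint32_t)(path ^ (path >> 32));
}
```

Under that assumption, qid.path 0xFFFFFFFF and 0x200000000 would both
fold to the same 32bit inode number, so Landlock could not recover the
server-sent qid from the struct inode alone there.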
The advantage of doing this is that, for a server with reasonable
behaviour, Landlock users would not get incorrect denials (i.e. things
"just work"), while still maintaining security if the 9p server is
"reasonable" (in particular, an application sandboxed under Landlock would
not get access to unrelated files if it does not have a way to somehow get
those files to be recreated with an allowed inode number), whereas the
current patch has the problem with server side renames and directory
renames (server or client side), and also can't deal with hard links.
I'm not sure how attractive this solution is to various people here -
Mickaël is happy with special-casing 9pfs in Landlock, and in fact he
suggested this idea in the first place, but I think this has the potential
to be quite complicated (but technically more correct). It would also
only work for Landlock, and if e.g. fsnotify wants to have the same
behaviour, that would need its own changes too.
Apologies for the long-winded explanation, any thoughts on this?
[1]: https://ericvh.github.io/9p-rfc/rfc9p2000.html#msgs
"If a file is deleted and recreated with the same name in the same
directory, the old and new path components of the qids should be
different."
---
Note: Even with the above, there's another potential problem - QEMU does
not, for some reason (I've not really investigated this very deeply, but
it's an observation from /proc/.../fd), keep a directory open when the guest
has a fid to it. This means that if a directory is deleted while we have
an active Landlock rule on it, a new file or directory may get the same
qid. (However, at least this still correctly handles directory renames,
and the only effect is Landlock allowing more files than intended in the
presence of a buggy server.)
(The Hyper-V 9p server, used by WSL, seems to have the same problem, and a
bit worse since even client-side renames break opened dir fds on the
WSL-to-Windows 9pfs (/mnt/c/...))
(Another challenge is that Landlock would have to know when a file is on a
9pfs in uncached mode - we probably don't need this behaviour for cached
mode filesystems, as we assume no server changes in that case and the
inode is reused already. We can certainly determine the FS of a file, but
I'm not sure about specific 9pfs cache options.)