[PATCH v2 0/7] fs/9p: Reuse inode based on path (in addition to qid)

Tingmao Wang m at maowtm.org
Tue Sep 16 23:59:21 UTC 2025


On 9/16/25 20:22, Christian Schoenebeck wrote:
> On Tuesday, September 16, 2025 4:01:40 PM CEST Tingmao Wang wrote:
>> On 9/16/25 14:35, Dominique Martinet wrote:
>>> Tingmao Wang wrote on Tue, Sep 16, 2025 at 01:44:27PM +0100:
>>>> [...]
>>>>
>>>> Note that in discussion with Mickaël (maintainer of Landlock) he
>>>> indicated
>>>> that he would be comfortable for Landlock to track a qid, instead of
>>>> holding a inode, specifically for 9pfs.
>>>
>>> Yes, I saw that, but what you pointed out about qid reuse makes me
>>> somewhat uncomfortable with that direction -- you could allow a
>>> directory, delete it, create a new one somewhere else, and if the
>>> underlying fs reuses the same inode number the rule would allow an
>>> unintended directory instead, so I'd rather not rely on qid for this
>>> either.
>>> But if you think that's not a problem in practice (because e.g. landlock
>>> would somehow detect the dir got deleted or another good reason it's not
>>> a problem) then I agree it's probably the simplest way forward
>>> implementation-wise.
>>
>> Sorry, I forgot to add that this idea would also involve Landlock holding
>> a reference to the fid (or dentry, but that's problematic due to breaking
>> unmount unless we can have a new hook) to keep the file open on the host
>> side so that the qid won't be reused (ignoring collisions caused by
>> different filesystems mounted under one 9pfs export when multidev mapping
>> is not enabled).
> 
> I see that you are proposing an option for your proposed qid-based re-using 
> of dentries. I don't think it should be on by default though, considering 
> what we already discussed (e.g. inodes recycled by ext4, but also that not 
> all 9p servers handle inode collisions).

Just to be clear, this approach (Landlock holding a fid reference, then
using the qid as a key to search for rules when a Landlocked process
accesses the previously remembered file, possibly after the file has been
moved on the server) would live entirely in Landlock and would only
affect Landlock, not 9pfs (so I'm not sure what you meant by "re-using of
dentries").

The idea behind holding a fid reference within Landlock is that, because
we keep the file open, the inode would not get recycled by ext4, and thus
no other file can reuse the qid until we drop that reference (when the
Landlock domain terminates, or when the 9p filesystem is unmounted).
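
To make that concrete, here is a rough sketch of what such a rule entry
might look like.  This is hypothetical, not actual Landlock code: struct
p9_qid and struct p9_fid are the real 9p types, everything else is made
up for illustration:

    #include <linux/types.h>
    #include <net/9p/9p.h>      /* struct p9_qid */
    #include <net/9p/client.h>  /* struct p9_fid, p9_fid_put() */

    /* Hypothetical rule entry: pin the fid, key lookups on the qid. */
    struct landlock_9p_object {
        struct p9_qid qid;   /* key compared on each access */
        struct p9_fid *fid;  /* pinned so the server keeps the backing
                              * file open and the qid isn't reused;
                              * released via p9_fid_put() when the domain
                              * terminates or the fs is unmounted */
    };

    /* On access, compare qids instead of inode pointers.  (Whether
     * qid.version should be part of the identity check is debatable,
     * since some servers bump it on modification.) */
    static bool landlock_9p_match(const struct landlock_9p_object *obj,
                                  const struct p9_qid *qid)
    {
        return obj->qid.type == qid->type && obj->qid.path == qid->path;
    }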

> 
>> (There's the separate issue of QEMU seemingly not keeping a directory open
>> on the host when the guest has a fid to it though.  I checked that if the
>> dir is renamed on the host side, any process in the guest that has an fd to
>> it (checked via cd in a shell) will not be able to use that fd to read it
>> anymore.  This also means that another directory might be created with the
>> same qid.path)
> 
> For all open FIDs QEMU retains a descriptor to the file/directory.
> 
> Which 9p message do you see sent to server, Trename or Trenameat?
> 
> Does this always happen to you or just sometimes, i.e. under heavy load? 

It always happens; see the log below (neither Trename nor Trenameat is
sent, since the rename is done on the host):

    qemu flags: -virtfs "local,path=/tmp/test,mount_tag=test,security_model=passthrough,readonly=off,multidevs=remap"
    qemu version: QEMU emulator version 10.1.0 (Debian 1:10.1.0+ds-5)
    guest kernel version: 6.17.0-rc5 (for the avoidance of doubt, this is clean 6.17-rc5 with no patches)
    qemu pid: 511476

    root at host # ls -la /proc/511476/fd | grep test
    lr-x------ 1 root root 64 Sep 17 00:35 41 -> /tmp/test

    root at guest # mount --mkdir -t 9p -o trans=virtio,cache=none,inodeident=qid,debug=13 test /tmp/test
    root at guest # mkdir /tmp/test/dir1
    root at guest # cd /tmp/test/dir1
     9pnet: -- v9fs_vfs_getattr_dotl (183): dentry: ffff888102ed4d38
     9pnet: -- v9fs_fid_find (183):  dentry: / (ffff888102ed4d38) uid 0 any 0
     9pnet: (00000183) >>> TGETATTR fid 1, request_mask 16383
     9pnet: (00000183) >>> size=19 type: 24 tag: 0
     9pnet: (00000183) <<< size=160 type: 25 tag: 0
     9pnet: (00000183) <<< RGETATTR st_result_mask=6143
     <<< qid=80.3.68c9c8a3
     <<< st_mode=000043ff st_nlink=3
     <<< st_uid=0 st_gid=0
     <<< st_rdev=0 st_size=3c st_blksize=131072 st_blocks=0
     <<< st_atime_sec=1758065706 st_atime_nsec=857221735
     <<< st_mtime_sec=1758065827 st_mtime_nsec=745359877
     <<< st_ctime_sec=1758065827 st_ctime_nsec=745359877
     <<< st_btime_sec=0 st_btime_nsec=0
     <<< st_gen=0 st_data_version=0
     9pnet: -- v9fs_vfs_lookup (183): dir: ffff8881090e0000 dentry: (dir1) ffff888102e458f8 flags: 1
     9pnet: -- v9fs_fid_find (183):  dentry: / (ffff888102ed4d38) uid 0 any 0
     9pnet: (00000183) >>> TWALK fids 1,2 nwname 1d wname[0] dir1
     9pnet: (00000183) >>> size=23 type: 110 tag: 0
     9pnet: (00000183) <<< size=22 type: 111 tag: 0
     9pnet: (00000183) <<< RWALK nwqid 1:
     9pnet: (00000183) <<<     [0] 80.5.68c9dca3
     9pnet: (00000183) >>> TGETATTR fid 2, request_mask 6143
     9pnet: (00000183) >>> size=19 type: 24 tag: 0
     9pnet: (00000183) <<< size=160 type: 25 tag: 0
     9pnet: (00000183) <<< RGETATTR st_result_mask=6143
     <<< qid=80.5.68c9dca3
     <<< st_mode=000041ed st_nlink=2
     <<< st_uid=0 st_gid=0
     <<< st_rdev=0 st_size=28 st_blksize=131072 st_blocks=0
     <<< st_atime_sec=1758065827 st_atime_nsec=745359877
     <<< st_mtime_sec=1758065827 st_mtime_nsec=745359877
     <<< st_ctime_sec=1758065827 st_ctime_nsec=749830521
     <<< st_btime_sec=0 st_btime_nsec=0
     <<< st_gen=0 st_data_version=0
     9pnet: -- v9fs_vfs_getattr_dotl (183): dentry: ffff888102e458f8
     9pnet: -- v9fs_fid_find (183):  dentry: dir1 (ffff888102e458f8) uid 0 any 0
     9pnet: (00000183) >>> TGETATTR fid 2, request_mask 16383
     9pnet: (00000183) >>> size=19 type: 24 tag: 0
     9pnet: (00000183) <<< size=160 type: 25 tag: 0
     9pnet: (00000183) <<< RGETATTR st_result_mask=6143
     <<< qid=80.5.68c9dca3
     <<< st_mode=000041ed st_nlink=2
     <<< st_uid=0 st_gid=0
     <<< st_rdev=0 st_size=28 st_blksize=131072 st_blocks=0
     <<< st_atime_sec=1758065827 st_atime_nsec=745359877
     <<< st_mtime_sec=1758065827 st_mtime_nsec=745359877
     <<< st_ctime_sec=1758065827 st_ctime_nsec=749830521
     <<< st_btime_sec=0 st_btime_nsec=0
     <<< st_gen=0 st_data_version=0
     9pnet: -- v9fs_dentry_release (183):  dentry: dir1 (ffff888102e458f8)
     9pnet: (00000183) >>> TCLUNK fid 2 (try 0)
     9pnet: (00000183) >>> size=11 type: 120 tag: 0
     9pnet: (00000183) <<< size=7 type: 121 tag: 0
     9pnet: (00000183) <<< RCLUNK fid 2
     9pnet: -- v9fs_vfs_lookup (183): dir: ffff8881090e0000 dentry: (dir1) ffff888102e45a70 flags: 3
     9pnet: -- v9fs_fid_find (183):  dentry: / (ffff888102ed4d38) uid 0 any 0
     9pnet: (00000183) >>> TWALK fids 1,2 nwname 1d wname[0] dir1
     9pnet: (00000183) >>> size=23 type: 110 tag: 0
     9pnet: (00000183) <<< size=22 type: 111 tag: 0
     9pnet: (00000183) <<< RWALK nwqid 1:
     9pnet: (00000183) <<<     [0] 80.5.68c9dca3
     9pnet: (00000183) >>> TGETATTR fid 2, request_mask 6143
     9pnet: (00000183) >>> size=19 type: 24 tag: 0
     9pnet: (00000183) <<< size=160 type: 25 tag: 0
     9pnet: (00000183) <<< RGETATTR st_result_mask=6143
     <<< qid=80.5.68c9dca3
     <<< st_mode=000041ed st_nlink=2
     <<< st_uid=0 st_gid=0
     <<< st_rdev=0 st_size=28 st_blksize=131072 st_blocks=0
     <<< st_atime_sec=1758065827 st_atime_nsec=745359877
     <<< st_mtime_sec=1758065827 st_mtime_nsec=745359877
     <<< st_ctime_sec=1758065827 st_ctime_nsec=749830521
     <<< st_btime_sec=0 st_btime_nsec=0
     <<< st_gen=0 st_data_version=0

     (fid 2 is now a persistent handle pointing to /dir1; I'm not sure
     why the walk was done twice)

    root at host # ls -la /proc/511476/fd | grep test
    lr-x------ 1 root root 64 Sep 17 00:35 41 -> /tmp/test
    (no fd points to dir1)

    root at host # mv -v /tmp/test/dir1 /tmp/test/dir2
    renamed '/tmp/test/dir1' -> '/tmp/test/dir2'

    root at guest:/tmp/test/dir1# ls
     9pnet: -- v9fs_vfs_getattr_dotl (183): dentry: ffff888102e45a70
     9pnet: -- v9fs_fid_find (183):  dentry: dir1 (ffff888102e45a70) uid 0 any 0
     9pnet: (00000183) >>> TGETATTR fid 2, request_mask 16383
     9pnet: (00000183) >>> size=19 type: 24 tag: 0
     9pnet: (00000183) <<< size=11 type: 7 tag: 0
     9pnet: (00000183) <<< RLERROR (-2)
     9pnet: -- v9fs_file_open (188): inode: ffff888102e80640 file: ffff88810af45340
     9pnet: -- v9fs_fid_find (188):  dentry: dir1 (ffff888102e45a70) uid 0 any 0
     9pnet: (00000188) >>> TWALK fids 2,3 nwname 0d wname[0] (null)
     9pnet: (00000188) >>> size=17 type: 110 tag: 0
     9pnet: (00000188) <<< size=11 type: 7 tag: 0
     9pnet: (00000188) <<< RLERROR (-2)
    ls: cannot open directory '.': No such file or directory

It looks like as soon as the directory is moved on the host, TGETATTR on
the guest-opened fid 2 fails, even though I would expect that, if QEMU
opened an fd to the dir and used that fd whenever fid 2 is used, TGETATTR
would succeed.  The fact that nothing in /proc/511476/fd points to dir1
is also suspicious.
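
For comparison, this is the plain POSIX behaviour I would have expected
if QEMU kept an fd per open fid (a standalone host-side demo, nothing
QEMU-specific; run it in an empty scratch directory):

    /* Demo: an open fd keeps working across a rename; the old path
     * does not. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        struct stat st;

        mkdir("dir1", 0755);
        int fd = open("dir1", O_RDONLY | O_DIRECTORY);

        rename("dir1", "dir2");

        /* fstat() on the retained fd still succeeds after the rename: */
        printf("fstat(fd) = %d\n", fstat(fd, &st));         /* 0 */
        /* ...but resolving the old path fails with ENOENT, which is
         * exactly what the RLERROR (-2) above looks like: */
        printf("stat(\"dir1\") = %d\n", stat("dir1", &st)); /* -1 */

        close(fd);
        rmdir("dir2");
        return 0;
    }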

Also, if I move the dir back on the host, then reopen it in the guest, ls
starts working again:

    root at host # mv -v /tmp/test/dir2 /tmp/test/dir1
    renamed '/tmp/test/dir2' -> '/tmp/test/dir1'

    root at guest:/tmp/test/dir1# ls
     9pnet: -- v9fs_file_open (189): inode: ffff888102e80640 file: ffff88810af47100
     9pnet: -- v9fs_fid_find (189):  dentry: dir1 (ffff888102e45a70) uid 0 any 0
     9pnet: (00000189) >>> TWALK fids 2,3 nwname 0d wname[0] (null)
     9pnet: (00000189) >>> size=17 type: 110 tag: 0
     9pnet: (00000189) <<< size=9 type: 111 tag: 0
     9pnet: (00000189) <<< RWALK nwqid 0:
     9pnet: (00000189) >>> TLOPEN fid 3 mode 100352
     9pnet: (00000189) >>> size=15 type: 12 tag: 0
     9pnet: (00000189) <<< size=24 type: 13 tag: 0
     9pnet: (00000189) <<< RLOPEN qid 80.5.68c9dca3 iounit 0
     9pnet: -- v9fs_vfs_getattr_dotl (189): dentry: ffff888102e45a70
     9pnet: -- v9fs_fid_find (189):  dentry: dir1 (ffff888102e45a70) uid 0 any 0
     9pnet: (00000189) >>> TGETATTR fid 2, request_mask 16383
     9pnet: (00000189) >>> size=19 type: 24 tag: 0
     9pnet: (00000189) <<< size=160 type: 25 tag: 0
     9pnet: (00000189) <<< RGETATTR st_result_mask=6143
     <<< qid=80.5.68c9dca3
     <<< st_mode=000041ed st_nlink=2
     <<< st_uid=0 st_gid=0
     <<< st_rdev=0 st_size=28 st_blksize=131072 st_blocks=0
     <<< st_atime_sec=1758065827 st_atime_nsec=745359877
     <<< st_mtime_sec=1758065827 st_mtime_nsec=745359877
     <<< st_ctime_sec=1758066075 st_ctime_nsec=497687251
     <<< st_btime_sec=0 st_btime_nsec=0
     <<< st_gen=0 st_data_version=0
     9pnet: -- v9fs_dir_readdir_dotl (189): name dir1
     9pnet: (00000189) >>> TREADDIR fid 3 offset 0 count 131072
     9pnet: (00000189) >>> size=23 type: 40 tag: 0
     9pnet: (00000189) <<< size=62 type: 41 tag: 0
     9pnet: (00000189) <<< RREADDIR count 51
     9pnet: (00000189) >>> TREADDIR fid 3 offset 2147483647 count 131072
     9pnet: (00000189) >>> size=23 type: 40 tag: 0
     9pnet: (00000189) <<< size=11 type: 41 tag: 0
     9pnet: (00000189) <<< RREADDIR count 0
     9pnet: -- v9fs_dir_readdir_dotl (189): name dir1
     9pnet: (00000189) >>> TREADDIR fid 3 offset 2147483647 count 131072
     9pnet: (00000189) >>> size=23 type: 40 tag: 0
     9pnet: (00000189) <<< size=11 type: 41 tag: 0
     9pnet: (00000189) <<< RREADDIR count 0
     9pnet: -- v9fs_dir_release (189): inode: ffff888102e80640 filp: ffff88810af47100 fid: 3
     9pnet: (00000189) >>> TCLUNK fid 3 (try 0)
     9pnet: (00000189) >>> size=11 type: 120 tag: 0
     9pnet: (00000189) <<< size=7 type: 121 tag
    root at guest:/tmp/test/dir1# echo $?
    0

Somehow, if I rename it in the guest instead, everything works, even
though the guest is still using the same fid 2 (and it never asked QEMU
to walk the new path):

    root at guest:/tmp/test/dir1# mv /tmp/test/dir1 /tmp/test/dir2
     9pnet: -- v9fs_vfs_getattr_dotl (183): dentry: ffff888102e45a70
     9pnet: -- v9fs_fid_find (183):  dentry: dir1 (ffff888102e45a70) uid 0 any 0
     9pnet: (00000183) >>> TGETATTR fid 2, request_mask 16383
     9pnet: (00000183) >>> size=19 type: 24 tag: 0
     9pnet: (00000183) <<< size=160 type: 25 tag: 0
     9pnet: (00000183) <<< RGETATTR st_result_mask=6143
     <<< qid=80.5.68c9dca3
     <<< st_mode=000041ed st_nlink=2
     <<< st_uid=0 st_gid=0
     <<< st_rdev=0 st_size=28 st_blksize=131072 st_blocks=0
     <<< st_atime_sec=1758066561 st_atime_nsec=442431580
     <<< st_mtime_sec=1758065827 st_mtime_nsec=745359877
     <<< st_ctime_sec=1758066559 st_ctime_nsec=570428555
     <<< st_btime_sec=0 st_btime_nsec=0
     <<< st_gen=0 st_data_version=0
     9pnet: -- v9fs_vfs_lookup (194): dir: ffff8881090e0000 dentry: (dir2) ffff888102edca48 flags: e0000
     9pnet: -- v9fs_fid_find (194):  dentry: / (ffff888102ed4d38) uid 0 any 0
     9pnet: (00000194) >>> TWALK fids 1,3 nwname 1d wname[0] dir2
     9pnet: (00000194) >>> size=23 type: 110 tag: 0
     9pnet: (00000194) <<< size=11 type: 7 tag: 0
     9pnet: (00000194) <<< RLERROR (-2)
     9pnet: -- v9fs_dentry_release (194):  dentry: dir2 (ffff888102edca48)
     9pnet: -- v9fs_vfs_lookup (194): dir: ffff8881090e0000 dentry: (dir2) ffff888102edcbc0 flags: 0
     9pnet: -- v9fs_fid_find (194):  dentry: / (ffff888102ed4d38) uid 0 any 0
     9pnet: (00000194) >>> TWALK fids 1,3 nwname 1d wname[0] dir2
     9pnet: (00000194) >>> size=23 type: 110 tag: 0
     9pnet: (00000194) <<< size=11 type: 7 tag: 0
     9pnet: (00000194) <<< RLERROR (-2)
     9pnet: -- v9fs_dentry_release (194):  dentry: dir2 (ffff888102edcbc0)
     9pnet: -- v9fs_vfs_lookup (194): dir: ffff8881090e0000 dentry: (dir2) ffff888102edcd38 flags: a0000
     9pnet: -- v9fs_fid_find (194):  dentry: / (ffff888102ed4d38) uid 0 any 0
     9pnet: (00000194) >>> TWALK fids 1,3 nwname 1d wname[0] dir2
     9pnet: (00000194) >>> size=23 type: 110 tag: 0
     9pnet: (00000194) <<< size=11 type: 7 tag: 0
     9pnet: (00000194) <<< RLERROR (-2)
     9pnet: -- v9fs_vfs_rename (194): 
     9pnet: -- v9fs_fid_find (194):  dentry: dir1 (ffff888102e45a70) uid 0 any 0
     9pnet: -- v9fs_fid_find (194):  dentry: / (ffff888102ed4d38) uid 0 any 0
     9pnet: (00000194) >>> TWALK fids 1,3 nwname 0d wname[0] (null)
     9pnet: (00000194) >>> size=17 type: 110 tag: 0
     9pnet: (00000194) <<< size=9 type: 111 tag: 0
     9pnet: (00000194) <<< RWALK nwqid 0:
     9pnet: -- v9fs_fid_find (194):  dentry: / (ffff888102ed4d38) uid 0 any 0
     9pnet: (00000194) >>> TWALK fids 1,4 nwname 0d wname[0] (null)
     9pnet: (00000194) >>> size=17 type: 110 tag: 0
     9pnet: (00000194) <<< size=9 type: 111 tag: 0
     9pnet: (00000194) <<< RWALK nwqid 0:
     9pnet: (00000194) >>> TRENAMEAT olddirfid 3 old name dir1 newdirfid 4 new name dir2
     9pnet: (00000194) >>> size=27 type: 74 tag: 0
     9pnet: (00000194) <<< size=7 type: 75 tag: 0
     9pnet: (00000194) <<< RRENAMEAT newdirfid 4 new name dir2
     9pnet: (00000194) >>> TCLUNK fid 4 (try 0)
     9pnet: (00000194) >>> size=11 type: 120 tag: 0
     9pnet: (00000194) <<< size=7 type: 121 tag: 0
     9pnet: (00000194) <<< RCLUNK fid 4
     9pnet: (00000194) >>> TCLUNK fid 3 (try 0)
     9pnet: (00000194) >>> size=11 type: 120 tag: 0
     9pnet: (00000194) <<< size=7 type: 121 tag: 0
     9pnet: (00000194) <<< RCLUNK fid 3
     9pnet: -- v9fs_dentry_release (194):  dentry: dir2 (ffff888102edcd38)
    root at guest:/tmp/test/dir1# ls
     9pnet: -- v9fs_file_open (195): inode: ffff888102e80640 file: ffff88810b2b1500
     9pnet: -- v9fs_fid_find (195):  dentry: dir2 (ffff888102e45a70) uid 0 any 0
     9pnet: (00000195) >>> TWALK fids 2,3 nwname 0d wname[0] (null)
     9pnet: (00000195) >>> size=17 type: 110 tag: 0
     9pnet: (00000195) <<< size=9 type: 111 tag: 0
     9pnet: (00000195) <<< RWALK nwqid 0:
     9pnet: (00000195) >>> TLOPEN fid 3 mode 100352
     9pnet: (00000195) >>> size=15 type: 12 tag: 0
     9pnet: (00000195) <<< size=24 type: 13 tag: 0
     9pnet: (00000195) <<< RLOPEN qid 80.5.68c9dca3 iounit 0
     9pnet: -- v9fs_vfs_getattr_dotl (195): dentry: ffff888102e45a70
     9pnet: -- v9fs_fid_find (195):  dentry: dir2 (ffff888102e45a70) uid 0 any 0
     9pnet: (00000195) >>> TGETATTR fid 2, request_mask 16383
     9pnet: (00000195) >>> size=19 type: 24 tag: 0
     9pnet: (00000195) <<< size=160 type: 25 tag: 0
     9pnet: (00000195) <<< RGETATTR st_result_mask=6143
     <<< qid=80.5.68c9dca3
     <<< st_mode=000041ed st_nlink=2
     <<< st_uid=0 st_gid=0
     <<< st_rdev=0 st_size=28 st_blksize=131072 st_blocks=0
     <<< st_atime_sec=1758066561 st_atime_nsec=442431580
     <<< st_mtime_sec=1758065827 st_mtime_nsec=745359877
     <<< st_ctime_sec=1758066568 st_ctime_nsec=562443096
     <<< st_btime_sec=0 st_btime_nsec=0
     <<< st_gen=0 st_data_version=0
     9pnet: -- v9fs_dir_readdir_dotl (195): name dir2
     9pnet: (00000195) >>> TREADDIR fid 3 offset 0 count 131072
     9pnet: (00000195) >>> size=23 type: 40 tag: 0
     9pnet: (00000195) <<< size=62 type: 41 tag: 0
     9pnet: (00000195) <<< RREADDIR count 51
     9pnet: (00000195) >>> TREADDIR fid 3 offset 2147483647 count 131072
     9pnet: (00000195) >>> size=23 type: 40 tag: 0
     9pnet: (00000195) <<< size=11 type: 41 tag: 0
     9pnet: (00000195) <<< RREADDIR count 0
     9pnet: -- v9fs_dir_readdir_dotl (195): name dir2
     9pnet: (00000195) >>> TREADDIR fid 3 offset 2147483647 count 131072
     9pnet: (00000195) >>> size=23 type: 40 tag: 0
     9pnet: (00000195) <<< size=11 type: 41 tag: 0
     9pnet: (00000195) <<< RREADDIR count 0
     9pnet: -- v9fs_dir_release (195): inode: ffff888102e80640 filp: ffff88810b2b1500 fid: 3
     9pnet: (00000195) >>> TCLUNK fid 3 (try 0)
     9pnet: (00000195) >>> size=11 type: 120 tag: 0
     9pnet: (00000195) <<< size=7 type: 121 tag: 0
     9pnet: (00000195) <<< RCLUNK fid 3

If this is surprising, I'm happy to take a deeper look over the weekend
(but I've never tried to debug QEMU itself :D).

> Because even though QEMU retains descriptors of open FIDs, when the QEMU 
> process approaches the host system's max allowed number of open file 
> descriptors, v9fs_reclaim_fd() [hw/9pfs/9p.c] is called, which closes some 
> descriptors of older FIDs to (at least) keep the QEMU process alive.
> 
> BTW: to prevent these descriptor reclaims from happening too often, I plan 
> to do what many other file servers do: ask the host system at process start 
> to increase the max number of file descriptors.

Note that the above was reproduced with only one file open (the dir being
renamed around), so descriptor reclaim shouldn't be a factor here.
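
(For what it's worth, the soft-limit bump you describe is presumably the
standard getrlimit/setrlimit pattern; a rough generic sketch, not actual
QEMU code:

    #include <sys/resource.h>

    /* Raise the soft RLIMIT_NOFILE to the hard limit at process start. */
    static void raise_nofile_soft_limit(void)
    {
        struct rlimit rl;

        if (getrlimit(RLIMIT_NOFILE, &rl) == 0 && rl.rlim_cur < rl.rlim_max) {
            rl.rlim_cur = rl.rlim_max;
            setrlimit(RLIMIT_NOFILE, &rl);
        }
    }

)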

Kind regards,
Tingmao

> 
> /Christian
> 
> 


