[PATCH v3 bpf-next 1/4] kernfs: remove iattr_mutex

Christian Brauner brauner at kernel.org
Wed Jul 2 12:17:57 UTC 2025


On Wed, Jul 02, 2025 at 11:47:58AM +0100, André Draszik wrote:
> Hi,
> 
> On Sun, 2025-06-22 at 23:38 -0700, Song Liu wrote:
> > From: Christian Brauner <brauner at kernel.org>
> > 
> > All allocations of struct kernfs_iattrs are serialized through a global
> > mutex. Simply do a racy allocation and let the first one win. I bet most
> > callers are under inode->i_rwsem anyway and it wouldn't be needed but
> > let's not require that.
> > 
> > Signed-off-by: Christian Brauner <brauner at kernel.org>
> > Acked-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
> > Acked-by: Tejun Heo <tj at kernel.org>
> > Signed-off-by: Song Liu <song at kernel.org>
> 
> On next-20250701, ls -lA gives errors on /sys:
> 
> $ ls -lA /sys/
> ls: /sys/: No data available
> ls: /sys/kernel: No data available
> ls: /sys/power: No data available
> ls: /sys/class: No data available
> ls: /sys/devices: No data available
> ls: /sys/dev: No data available
> ls: /sys/hypervisor: No data available
> ls: /sys/fs: No data available
> ls: /sys/bus: No data available
> ls: /sys/firmware: No data available
> ls: /sys/block: No data available
> ls: /sys/module: No data available
> total 0
> drwxr-xr-x   2 root root 0 Jan  1  1970 block
> drwxr-xr-x  52 root root 0 Jan  1  1970 bus
> drwxr-xr-x  88 root root 0 Jan  1  1970 class
> drwxr-xr-x   4 root root 0 Jan  1  1970 dev
> drwxr-xr-x  11 root root 0 Jan  1  1970 devices
> drwxr-xr-x   3 root root 0 Jan  1  1970 firmware
> drwxr-xr-x  10 root root 0 Jan  1  1970 fs
> drwxr-xr-x   2 root root 0 Jul  2 09:43 hypervisor
> drwxr-xr-x  14 root root 0 Jan  1  1970 kernel
> drwxr-xr-x 251 root root 0 Jan  1  1970 module
> drwxr-xr-x   3 root root 0 Jul  2 09:43 power
> 
> 
> and my bisect is pointing to this commit. Simply reverting it also fixes
> the errors.
> 
> 
> Do you have any suggestions?

Yes, apparently the xattr selftests don't cover sysfs/kernfs. The issue
is that the commit changed listxattr() to skip allocation of the xattr
header and instead just return -ENODATA. We should just allocate like
before. Tested just now:

user1 at localhost:~$ sudo ls -al /sys/kernel/
total 0
drwxr-xr-x  17 root root    0 Jul  2 13:41 .
dr-xr-xr-x  12 root root    0 Jul  2 13:41 ..
-r--r--r--   1 root root 4096 Jul  2 13:41 address_bits
drwxr-xr-x   3 root root    0 Jul  2 13:41 boot_params
drwxr-xr-x   2 root root    0 Jul  2 13:41 btf
drwxr-xr-x   2 root root    0 Jul  2 13:41 cgroup
drwxr-xr-x   2 root root    0 Jul  2 13:41 config
-r--r--r--   1 root root 4096 Jul  2 13:41 cpu_byteorder
-r--r--r--   1 root root 4096 Jul  2 13:41 crash_elfcorehdr_size
drwx------  34 root root    0 Jul  2 13:41 debug
-r--r--r--   1 root root 4096 Jul  2 13:41 fscaps
-r--r--r--   1 root root 4096 Jul  2 13:41 hardlockup_count
drwxr-xr-x   2 root root    0 Jul  2 13:41 iommu_groups
drwxr-xr-x 344 root root    0 Jul  2 13:41 irq
-r--r--r--   1 root root 4096 Jul  2 13:41 kexec_crash_loaded
-rw-r--r--   1 root root 4096 Jul  2 13:41 kexec_crash_size
-r--r--r--   1 root root 4096 Jul  2 13:41 kexec_loaded
drwxr-xr-x   9 root root    0 Jul  2 13:41 mm
-r--r--r--   1 root root   84 Jul  2 13:41 notes
-r--r--r--   1 root root 4096 Jul  2 13:41 oops_count
-rw-r--r--   1 root root 4096 Jul  2 13:41 profiling
-rw-r--r--   1 root root 4096 Jul  2 13:41 rcu_expedited
-rw-r--r--   1 root root 4096 Jul  2 13:41 rcu_normal
-r--r--r--   1 root root 4096 Jul  2 13:41 rcu_stall_count
drwxr-xr-x   2 root root    0 Jul  2 13:41 reboot
drwxr-xr-x   2 root root    0 Jul  2 13:41 sched_ext
drwxr-xr-x   4 root root    0 Jul  2 13:41 security
drwxr-xr-x 190 root root    0 Jul  2 13:41 slab
-r--r--r--   1 root root 4096 Jul  2 13:41 softlockup_count
drwxr-xr-x   2 root root    0 Jul  2 13:41 software_nodes
drwxr-xr-x   4 root root    0 Jul  2 13:41 sunrpc
drwxr-xr-x   6 root root    0 Jul  2 13:41 tracing
-r--r--r--   1 root root 4096 Jul  2 13:41 uevent_seqnum
-r--r--r--   1 root root 4096 Jul  2 13:41 vmcoreinfo
-r--r--r--   1 root root 4096 Jul  2 13:41 warn_count

I'm folding:

diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index 3c293a5a21b1..457f91c412d4 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -142,9 +142,9 @@ ssize_t kernfs_iop_listxattr(struct dentry *dentry, char *buf, size_t size)
        struct kernfs_node *kn = kernfs_dentry_node(dentry);
        struct kernfs_iattrs *attrs;

-       attrs = kernfs_iattrs_noalloc(kn);
+       attrs = kernfs_iattrs(kn);
        if (!attrs)
-               return -ENODATA;
+               return -ENOMEM;

        return simple_xattr_list(d_inode(dentry), &attrs->xattrs, buf, size);
 }

which brings it back to the old behavior.
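
For anyone following along, the "racy allocation, first one wins" idea
from the commit message can be sketched in user space with C11 atomics.
This is only an illustration, not the kernfs code: struct attrs, struct
node, node_attrs() and node_attrs_noalloc() are hypothetical stand-ins
for struct kernfs_iattrs, struct kernfs_node and the two lookup helpers
touched by the diff above.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-ins; not the kernel's definitions. */
struct attrs {
	int initialized;
};

struct node {
	_Atomic(struct attrs *) attrs;
};

/*
 * Return the node's attrs, allocating them on first use without a lock.
 * Concurrent callers may each allocate, but only one pointer is
 * published; the losers free their copy and use the winner's.
 */
static struct attrs *node_attrs(struct node *n)
{
	struct attrs *cur = atomic_load(&n->attrs);
	struct attrs *fresh;

	if (cur)
		return cur;

	fresh = calloc(1, sizeof(*fresh));
	if (!fresh)
		return NULL;	/* the caller maps this to -ENOMEM */
	fresh->initialized = 1;

	/* Publish only if the slot is still NULL. */
	cur = NULL;
	if (atomic_compare_exchange_strong(&n->attrs, &cur, fresh))
		return fresh;

	/* Lost the race: cur now holds the pointer the winner published. */
	free(fresh);
	return cur;
}

/* Lookup-only variant: never allocates, so NULL means "nothing set yet". */
static struct attrs *node_attrs_noalloc(struct node *n)
{
	return atomic_load(&n->attrs);
}
```

In this sketch a NULL from the allocating helper can only mean an
allocation failure, while a NULL from the lookup-only helper is the
normal state of a node that has never had attributes set.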

I'm also adding a selftest for this behavior. Patch appended.

