[PATCH 00/13] VFS: Filesystem information [ver #19]

Miklos Szeredi miklos at szeredi.hu
Wed Mar 18 16:05:50 UTC 2020

On Wed, Mar 18, 2020 at 4:08 PM David Howells <dhowells at redhat.com> wrote:

> ============================
> ============================
> Why is it better to go with a new system call rather than adding more magic
> stuff to /proc or /sysfs for each superblock object and each mount object?
>  (1) It can be targetted.  It makes it easy to query directly by path.
>      procfs and sysfs cannot do this easily.
>  (2) It's more efficient as we can return specific binary data rather than
>      making huge text dumps.  Granted, sysfs and procfs could present the
>      same data, though as lots of little files which have to be
>      individually opened, read, closed and parsed.

Asked this a number of times, but you haven't answered yet:  what
application would require such a high efficiency?

Nobody's suggesting we move stat(2) to proc interfaces, and AFAIK
nobody suggested we move /proc/PID/* to a binary syscall interface.
Each one has its place, and I strongly feel that mount info belongs in
the latter category.    Feel free to prove the opposite.

>  (3) We wouldn't have the overhead of open and close (even adding a
>      self-contained readfile() syscall has to do that internally

Busted: add f_op->readfile() and be done with all that.   For example
DEFINE_SHOW_ATTRIBUTE() could be trivially moved to that interface.

We could optimize existing proc, sys, etc. interfaces, but it's not
been an issue, apparently.

>  (4) Opening a file in procfs or sysfs has a pathwalk overhead for each
>      file accessed.  We can use an integer attribute ID instead (yes, this
>      is similar to ioctl) - but could also use a string ID if that is
>      preferred.
>  (5) Can easily query cross-namespace if, say, a container manager process
>      is given an fs_context that hasn't yet been mounted into a namespace -
>      or hasn't even been fully created yet.

Works with my patch.

>  (6) Don't have to create/delete a bunch of sysfs/procfs nodes each time a
>      mount happens or is removed - and since systemd makes much use of
>      mount namespaces and mount propagation, this will create a lot of
>      nodes.

Not true.

> The argument for doing this through procfs/sysfs/somemagicfs is that
> someone using a shell can just query the magic files using ordinary text
> tools, such as cat - and that has merit - but it doesn't solve the
> query-by-pathname problem.
> The suggested way around the query-by-pathname problem is to open the
> target file O_PATH and then look in a magic directory under procfs
> corresponding to the fd number to see a set of attribute files[*] laid out.
> Bash, however, can't open by O_PATH or O_NOFOLLOW as things stand...

Bash doesn't have fsinfo(2) either, so that's not really a good argument.

Implementing a utility to show mount attribute(s) by path is trivial
for the file based interface, while it would need to be updated for
each extension of fsinfo(2).   Same goes for libc, language bindings,


More information about the Linux-security-module-archive mailing list