[PATCH 00/13] VFS: Filesystem information [ver #19]

Thu Mar 19 10:37:37 UTC 2020

Miklos Szeredi <miklos at szeredi.hu> wrote:

> >  (2) It's more efficient as we can return specific binary data rather than
> >      making huge text dumps.  Granted, sysfs and procfs could present the
> >      same data, though as lots of little files which have to be
> >      individually opened, read, closed and parsed.
> 
> Asked this a number of times, but you haven't answered yet:  what
> application would require such a high efficiency?

Low efficiency means more time doing this when that time could be spent doing
other things - or even putting the CPU in a powersaving state.  Using an
open/read/close render-to-text-and-parse interface *will* be slower and less
efficient as there are more things you have to do to use it.

Then consider walking over all the mounts when there are 10000 of them - we
already have problems with /proc/mounts at that scale.  fsinfo() will end up
doing a lot less work.
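
To make the parsing cost concrete, here is a minimal sketch of what a
userspace consumer has to do for each line of /proc/self/mountinfo once the
text has been read in.  struct mount_rec and parse_mountinfo_line() are
hypothetical names made up for this illustration; the field layout follows
the format documented in proc(5), and escaped characters (such as \040 for a
space) are not decoded here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record type for this sketch; not a kernel or libmount type. */
struct mount_rec {
	int mnt_id;
	int parent_id;
	unsigned int major;
	unsigned int minor;
	char root[256];
	char mount_point[256];
	char fs_type[64];
};

/*
 * Parse one line of /proc/self/mountinfo, e.g.:
 *   36 35 98:0 /mnt1 /mnt2 rw,noatime master:1 - ext3 /dev/root rw,errors=continue
 * Returns 0 on success, -1 if the line does not match the expected layout.
 */
static int parse_mountinfo_line(const char *line, struct mount_rec *rec)
{
	const char *sep;

	/* Fixed leading fields: IDs, device number, root and mount point. */
	if (sscanf(line, "%d %d %u:%u %255s %255s",
		   &rec->mnt_id, &rec->parent_id,
		   &rec->major, &rec->minor,
		   rec->root, rec->mount_point) != 6)
		return -1;

	/* A variable number of optional fields precedes the " - " separator. */
	sep = strstr(line, " - ");
	if (!sep)
		return -1;

	/* The filesystem type is the first field after the separator. */
	if (sscanf(sep + 3, "%63s", rec->fs_type) != 1)
		return -1;

	return 0;
}
```

That is the per-line work only, on top of the open(), the read() loop and the
close(); with 10000 mounts it is repeated 10000 times.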

> I strongly feel that mount info belongs in the latter category

I feel strongly that a lot of stuff done through /proc or /sys shouldn't be.

Yes, it's nice that you can explore it with cat and poke it with echo, but it
has a number of problems: security, atomicity, efficiency and providing a
round-the-back way to pin stuff if not done right.

> >  (3) We wouldn't have the overhead of open and close (even adding a
> >      self-contained readfile() syscall has to do that internally
> 
> Busted: add f_op->readfile() and be done with all that.   For example
> DEFINE_SHOW_ATTRIBUTE() could be trivially moved to that interface.

Look at your example.  "f_op->".  That's "file->f_op->" I presume.

You would have to make it "i_op->" to avoid the open and the close - and for
things like procfs and sysfs, that's probably entirely reasonable - but bear
in mind that you still have to apply all the LSM file security controls, just
in case the backing filesystem is, say, ext4 rather than procfs.

> We could optimize existing proc, sys, etc. interfaces, but it's not
> been an issue, apparently.

You can't get rid of or change many of the existing interfaces.  A lot of them
are effectively indirect system calls and are, as such, part of the fixed
UAPI.  You'd have to add a parallel optimised set.

> >  (6) Don't have to create/delete a bunch of sysfs/procfs nodes each time a
> >      mount happens or is removed - and since systemd makes much use of
> >      mount namespaces and mount propagation, this will create a lot of
> >      nodes.
> 
> Not true.

This may not be true if you roll your own special filesystem.  It *is* true if
you do it in procfs or sysfs.  The files don't exist if you don't create nodes
or attribute tables for them.

> > The argument for doing this through procfs/sysfs/somemagicfs is that
> > someone using a shell can just query the magic files using ordinary text
> > tools, such as cat - and that has merit - but it doesn't solve the
> > query-by-pathname problem.
> >
> > The suggested way around the query-by-pathname problem is to open the
> > target file O_PATH and then look in a magic directory under procfs
> > corresponding to the fd number to see a set of attribute files[*] laid out.
> > Bash, however, can't open by O_PATH or O_NOFOLLOW as things stand...
> 
> Bash doesn't have fsinfo(2) either, so that's not really a good argument.

I never claimed that fsinfo() could be accessed directly from the shell.  For
your proposal, you claimed "immediately usable from all programming languages,
including scripts".
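
For reference, the O_PATH step described in the quoted text looks roughly
like this from C.  This is only a sketch: open_path_fd() and fd_proc_path()
are hypothetical helper names, and /proc/self/fd/<n> is used purely as a
stand-in for wherever the proposed attribute directory would live, since that
location is part of the proposal under discussion and not something that
exists today.  Only the open() flags are real.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for O_PATH */
#endif
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Fallback in case the libc headers were already pulled in without
 * _GNU_SOURCE.  This is the asm-generic value; a few architectures
 * (alpha, parisc, sparc) use a different one.
 */
#ifndef O_PATH
#define O_PATH 010000000
#endif

/*
 * Get a handle on the object named by path without opening it for I/O
 * and without following a trailing symlink.  Returns an fd or -1.
 */
static int open_path_fd(const char *path)
{
	return open(path, O_PATH | O_NOFOLLOW | O_CLOEXEC);
}

/*
 * Build the procfs name for an fd.  Returns 0 on success, -1 if the
 * buffer is too small.
 */
static int fd_proc_path(int fd, char *buf, size_t len)
{
	int n = snprintf(buf, len, "/proc/self/fd/%d", fd);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A shell script has no way to perform the first step, which is the point being
made about bash.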

> Implementing a utility to show mount attribute(s) by path is trivial
> for the file based interface, while it would need to be updated for
> each extension of fsinfo(2).   Same goes for libc, language bindings,
> etc.

That's not precisely true.  If you aren't using an extension to an fsinfo()
attribute, you wouldn't need to change anything[*].

If you want to use an extension - *even* through a file based interface - you
*would* have to change your code and your parser.

And, no, extending an fsinfo() attribute would not require any changes to libc
unless libc is using that attribute[*] and wants to access the extension.
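
One common way to get this property from a binary interface is to have the
caller pass the size of the structure it was built against and to copy out no
more than that.  The sketch below shows the general pattern in plain userspace
C; it is an assumption made for illustration, not a description of how the
fsinfo() patches implement it, and struct attr_v1, struct attr_v2 and
copy_attr() are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* What an old program was compiled against. */
struct attr_v1 {
	unsigned long long f_blocks;
	unsigned long long f_bfree;
};

/* The same attribute after a field was appended at a later date. */
struct attr_v2 {
	unsigned long long f_blocks;
	unsigned long long f_bfree;
	unsigned long long f_bavail;	/* the extension */
};

/*
 * Copy an attribute of src_len bytes into a caller buffer of dst_len bytes.
 * A short buffer gets a truncated copy; a long buffer gets its tail zeroed.
 * Returns the full size of the attribute so the caller can tell whether
 * there was more to see.
 */
static size_t copy_attr(void *dst, size_t dst_len,
			const void *src, size_t src_len)
{
	size_t n = dst_len < src_len ? dst_len : src_len;

	memcpy(dst, src, n);
	if (dst_len > n)
		memset((char *)dst + n, 0, dst_len - n);
	return src_len;
}
```

Under this scheme an old binary using struct attr_v1 keeps working unchanged
when the attribute grows into struct attr_v2; only code that wants f_bavail
has to be rebuilt.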

[*] I assume that in C/C++ at least, you'd use linux/fsinfo.h rather than some
    libc version.

[*] statfs() could be emulated this way, but I'm not sure what else libc
    specifically is going to look at.  This is more aimed at libmount amongst
    other things.

David