[PATCH bpf-next 3/4] bpf: Introduce path iterator

Mickaël Salaün mic at digikod.net
Sat May 31 08:39:02 UTC 2025


On Fri, May 30, 2025 at 07:43:48PM +0100, Al Viro wrote:
> On Fri, May 30, 2025 at 02:20:39PM +0200, Mickaël Salaün wrote:
> 
> > Without access to mount_lock, what would be the best way to fix this
> > Landlock issue while making it backportable?
> > 
> > > 
> > > If we update path_parent in this patchset with choose_mountpoint(),
> > > and use it in Landlock, we will close this race condition, right?
> > 
> > choose_mountpoint() is currently private, but if we add a new filesystem
> > helper, I think the right approach would be to expose follow_dotdot(),
> > updating its arguments with public types.  This way the intermediates
> > mount points will not be exposed, RCU optimization will be leveraged,
> > and usage of this new helper will be simplified.
> 
> IMO anything that involves struct nameidata should remain inside
> fs/namei.c - something public might share helpers with it, but that's
> it.  We had more than enough pain on changes in there, and I'm pretty
> sure that we are not done yet; in the area around atomic_open, but not
> only there.  Parts of that are still too subtle, IMO - it got a lot
> better over the years, but I would really prefer to avoid the need
> to bring more code into analysis for any further massage.
> 
> Are you sure that follow_dotdot() behaviour is what you really want?
> 
> Note that it's not quite how the pathname resolution works.  There we
> have the result of follow_dotdot() fed to step_into(), and that changes
> things.  Try this:
> 
> mkdir /tmp/foo
> mkdir /tmp/foo/bar
> cd /tmp/foo/bar
> mount -t tmpfs none /tmp/foo
> touch /tmp/foo/x
> ls -Uldi . .. /tmp/foo ../.. /tmp ../x
> 
> and think about the results.  Traversing .. is basically "follow_up as much
> as possible, then to parent, then follow_down as much as possible" and
> the last part (../x) explains why we do it that way.
> 
> Which objects would you want to iterate through when dealing with the
> current directory in the experiment above?  Simulation of pathwalk
> would have the root of overmounting filesystem as the second object
> visited; follow_dotdot() would yield the directory overmounted by
> that instead.
> 
> I'm not saying that either behaviour is right for your case - just that
> they are not identical and it's something that needs to be consciously
> chosen.

Thanks, this helps. I didn't though about this semantic difference.  We
don't want the handle_dots() semantic (which call follow_dotdot() and
step_into()), only the (backward) pathwalk one which is equivalent to
follow_dotdot().  I'll add Landlock tests for this specific scenario.



More information about the Linux-security-module-archive mailing list