[PATCH v5 bpf-next 0/5] bpf path iterator
NeilBrown
neil at brown.name
Wed Jul 9 22:24:14 UTC 2025
On Thu, 10 Jul 2025, Song Liu wrote:
>
>
> > On Jul 9, 2025, at 9:06 AM, Mickaël Salaün <mic at digikod.net> wrote:\
>
> [...]
>
> >> If necessary, we hide “root" inside @data. This is good.
> >>
> >>> @path would be updated with latest ancestor path (e.g. @root).
> >>
> >> Update @path to the last ancestor and hold proper references.
> >> I missed this part earlier. With this feature, vfs_walk_ancestors
> >> should work usable with open-codeed bpf path iterator.
> >>
> >> I have a question about this behavior with RCU walk. IIUC, RCU
> >> walk does not hold reference to @ancestor when calling walk_cb().
> >
> > I think a reference to the mount should be held, but not necessarily to
> > the dentry if we are still in the same mount as the original path.
>
> If we update @path and do path_put() after the walk, we have to hold
> reference to both the mnt and the dentry, no?
>
> >
> >> If walk_cb() returns false, shall vfs_walk_ancestors() then
> >> grab a reference on @ancestor? This feels a bit weird to me.
> >
> > If walk_cb() checks for a root, it will return false when the path will
> > match, and the caller would expect to get this root path, right?
>
> If the user want to walk to the global root, walk_cb() may not
> return false at all, IIUC. walk_cb() may also return false on
> other conditions.
>
> >
> > In general, it's safer to always have the same behavior when holding or
> > releasing a reference. I think the caller should then always call
> > path_put() after vfs_walk_ancestors() whatever the return code is.
> >
> >> Maybe “updating @path to the last ancestor” should only apply to
> >> LOOKUP_RCU==false case?
> >>
> >>> @flags could contain LOOKUP_RCU or not, which enables us to have
> >>> walk_cb() not-RCU compatible.
> >>>
> >>> When passing LOOKUP_RCU, if the first call to vfs_walk_ancestors()
> >>> failed with -ECHILD, the caller can restart the walk by calling
> >>> vfs_walk_ancestors() again but without LOOKUP_RCU.
> >>
> >>
> >> Given we want callers to handle -ECHILD and call vfs_walk_ancestors
> >> again without LOOKUP_RCU, I think we should keep @path not changed
> >> With LOOKUP_RCU==true, and only update it to the last ancestor
> >> when LOOKUP_RCU==false.
> >
> > As Neil said, we don't want to explicitly pass LOOKUP_RCU as a public
> > flag. Instead, walk_cb() should never sleep (and then potentially be
> > called under RCU by the vfs_walk_ancestors() implementation).
>
> How should the user handle -ECHILD without LOOKUP_RCU flag? Say the
> following code in landlocked:
>
> /* Try RCU walk first */
> err = vfs_walk_ancestors(path, ll_cb, data, LOOKUP_RCU);
>
> if (err == -ECHILD) {
> struct path walk_path = *path;
>
> /* reset any data changed by the walk */
> reset_data(data);
>
> /* now do ref-walk */
> err = vfs_walk_ancestors(&walk_path, ll_cb, data, 0);
> }
>
> Or do you mean vfs_walk_ancestors will never return -ECHILD?
> Then we need vfs_walk_ancestors to call reset_data logic, right?
It isn't clear to me that vfs_walk_ancestors() needs to return anything.
All the communication happens through walk_cb()
walk_cb() is called with a path, the data, and a "may_sleep" flag.
If it needs to sleep but may_sleep is not set, it returns "-ECHILD"
which causes the walk to restart and use refcounts.
If it wants to stop, it returns 0.
If it wants to continue, it returns 1.
If it wants a reference to the path then it can use (new)
vfs_legitimize_path() which might fail.
If it wants a reference to the path and may_sleep is true, it can use
path_get() which won't fail.
When returning -ECHILD (either because of a need to sleep or because
vfs_legitimize_path() fails), walk_cb() would reset_data().
NeilBrown
More information about the Linux-security-module-archive
mailing list