[PATCH v2 09/15] ovl: Simplify ovl_lookup_real_one()
Chris Mason
clm at meta.com
Mon Feb 23 13:13:37 UTC 2026
NeilBrown <neilb at ownmail.net> wrote:
> From: NeilBrown <neil at brown.name>
>
> The primary purpose of this patch is to remove the locking from
> ovl_lookup_real_one() as part of centralising all locking of directories
> for name operations.
>
> The locking here isn't needed. By performing consistency tests after
> the lookup we can be sure that the result of the lookup was valid at
> least for a moment, which is all the original code promised.
>
> lookup_noperm_unlocked() is used for the lookup and it will take the
> lock if needed only where it is needed.
>
> Also:
> - don't take a reference to real->d_parent. The parent is
> only use for a pointer comparison, and no reference is needed for
> that.
> - Several "if" statements have a "goto" followed by "else" - the
> else isn't needed: the following statement can directly follow
> the "if" as a new statement
> - Use a consistent pattern of setting "err" before performing a test
> and possibly going to "fail".
> - remove the "out" label (now that we don't need to dput(parent) or
> unlock) and simply return from fail:.
Hi everyone,
Amir asked me to run these through, and this commit was flagged. I recently
changed the prompts to include some vfs knowledge:
https://github.com/masoncl/review-prompts/blob/main/kernel/subsystem/vfs.md
And I also put in some words to make it prefer reporting problems that might
not be bugs, but that do violate some rule in the .md files. This might
be a false positive from that new shift toward blindly trusting the guides,
but I'll let everyone here decide:
commit d1bfcef048896cdc9e4c57e3249b6ad344d64c02
Author: NeilBrown <neil at brown.name>
ovl: Simplify ovl_lookup_real_one()
This patch removes locking from ovl_lookup_real_one() and reorganizes the
control flow. The function now uses lookup_noperm_unlocked() instead of
lookup_noperm() and performs consistency validation after the lookup rather
than holding a lock during the operation.
> diff --git a/fs/overlayfs/export.c b/fs/overlayfs/export.c
> index 83f80fdb15674..b448fc9424b64 100644
> --- a/fs/overlayfs/export.c
> +++ b/fs/overlayfs/export.c
> @@ -349,69 +349,64 @@ static struct dentry *ovl_dentry_real_at(struct dentry *dentry, int idx)
> return NULL;
> }
>
> -/*
> - * Lookup a child overlay dentry to get a connected overlay dentry whose real
> - * dentry is @real. If @real is on upper layer, we lookup a child overlay
> - * dentry with the same name as the real dentry. Otherwise, we need to consult
> - * index for lookup.
> +/**
> + * ovl_lookup_real_one - Lookup a child overlay dentry to get an overlay dentry whose real dentry is given
> + * @connected: parent overlay dentry
> + * @real: given child real dentry
> + * @layer: layer in which @real exists
> + *
> + *
> + * Lookup a child overlay dentry in @connected with the same name as the @real
> + * dentry. Then check that the parent of the result is the real dentry for
> + * @connected, and @real is the real dentry for the result.
> + *
> + * Returns:
> + * %-ECHILD if the parent of @real is no longer the real dentry for @connected.
> + * %-ESTALE if @real is no the real dentry of the found dentry.
^^
There's a typo here: "is no the" should be "is not the".
> + * Otherwise the found dentry is returned.
> */
> static struct dentry *ovl_lookup_real_one(struct dentry *connected,
> struct dentry *real,
> const struct ovl_layer *layer)
> {
> - struct inode *dir = d_inode(connected);
> - struct dentry *this, *parent = NULL;
> + struct dentry *this;
> struct name_snapshot name;
> int err;
>
> /*
> - * Lookup child overlay dentry by real name. The dir mutex protects us
> - * from racing with overlay rename. If the overlay dentry that is above
> - * real has already been moved to a parent that is not under the
> - * connected overlay dir, we return -ECHILD and restart the lookup of
> - * connected real path from the top.
> - */
> - inode_lock_nested(dir, I_MUTEX_PARENT);
> - err = -ECHILD;
> - parent = dget_parent(real);
> - if (ovl_dentry_real_at(connected, layer->idx) != parent)
> - goto fail;
> -
> - /*
> - * We also need to take a snapshot of real dentry name to protect us
> + * We need to take a snapshot of real dentry name to protect us
> * from racing with underlying layer rename. In this case, we don't
> * care about returning ESTALE, only from dereferencing a free name
> * pointer because we hold no lock on the real dentry.
> */
> take_dentry_name_snapshot(&name, real);
> - /*
> - * No idmap handling here: it's an internal lookup.
> - */
> - this = lookup_noperm(&name.name, connected);
> + this = lookup_noperm_unlocked(&name.name, connected);
> release_dentry_name_snapshot(&name);
> +
> + err = -ECHILD;
> + if (ovl_dentry_real_at(connected, layer->idx) != real->d_parent)
> + goto fail;
> +
A subsystem pattern flags this as potentially concerning: the VFS subsystem
guide indicates that TOCTOU validation checking d_parent should also check
d_unhashed() because a dentry can be unhashed (removed from dcache by
concurrent operations) while its d_parent pointer remains unchanged.
The old code held inode_lock_nested(dir, I_MUTEX_PARENT) during the entire
lookup and validation, which serialized with operations that could unhash the
dentry. The new lockless pattern creates a window where:
1. lookup_noperm_unlocked() completes and returns dentry `this`
2. A concurrent unlink/remove operation unhashes `this`
3. The d_parent validation passes (parent pointer unchanged)
4. The function returns an unhashed dentry
Other overlayfs code uses the complete pattern. For example, in
ovl_lock_rename_workdir() at fs/overlayfs/util.c:
if (work->d_parent != workdir || d_unhashed(work)) {
...
}
Could this validation miss a race where the dentry is unhashed but d_parent
still matches? The subsequent checks (!this->d_inode for -ENOENT,
wrong real dentry for -ESTALE) might catch some cases, but is that
sufficient for all the scenarios where an unhashed dentry could cause
problems?
> err = PTR_ERR(this);
> - if (IS_ERR(this)) {
> + if (IS_ERR(this))
> goto fail;
> - } else if (!this || !this->d_inode) {
> - dput(this);
> - err = -ENOENT;
> +
> + err = -ENOENT;
> + if (!this || !this->d_inode)
> goto fail;
> - } else if (ovl_dentry_real_at(this, layer->idx) != real) {
> - dput(this);
> - err = -ESTALE;
> +
> + err = -ESTALE;
> + if (ovl_dentry_real_at(this, layer->idx) != real)
> goto fail;
> - }
>
> -out:
> - dput(parent);
> - inode_unlock(dir);
> return this;
>
> fail:
> pr_warn_ratelimited("failed to lookup one by real (%pd2, layer=%d, connected=%pd2, err=%i)\n",
> real, layer->idx, connected, err);
> - this = ERR_PTR(err);
> - goto out;
> + if (!IS_ERR(this))
> + dput(this);
> + return ERR_PTR(err);
> }
>
> static struct dentry *ovl_lookup_real(struct super_block *sb,
More information about the Linux-security-module-archive
mailing list