[RFC PATCH 3/3] Restart pathwalk on rename seqcount change
Tingmao Wang
m at maowtm.org
Wed Jun 4 18:56:09 UTC 2025
On 6/4/25 03:21, Al Viro wrote:
> On Wed, Jun 04, 2025 at 02:12:11AM +0100, Tingmao Wang wrote:
>> On 6/4/25 01:55, Al Viro wrote:
>>> On Wed, Jun 04, 2025 at 01:45:45AM +0100, Tingmao Wang wrote:
>>>> + rename_seqcount = read_seqbegin(&rename_lock);
>>>> + if (rename_seqcount % 2 == 1) {
>>>
>>> Please, describe the condition when that can happen, preferably
>>> along with a reproducer.
>>
>> My understanding is that when a rename is in progress the seqcount is odd,
>> is that correct?
>>
>> If that's the case, then the fs_race_test in patch 2 should act as a
>> reproducer, since it's constantly moving the directory.
>>
>> I can add a comment to explain this, thanks for pointing out.
>
> Please, read through the header declaring those primitives and read the
> documentation it refers to - it's useful for background.
Ok, so I didn't realize read_seqbegin actually waits for the seqcount to
turn even. I did read the header earlier when following dget_parent but
probably misremembered and mixed raw_seqcount_begin with read_seqbegin.
>
> What's more, look at the area covered by rename_lock - I seriously suspect
> that you are greatly overestimating it.
Admittedly "when a rename is in progress" is a vague statement. Looking
at what takes rename_lock in the code, it's only when we actually do
d_move where we take this lock (plus some other places), and the critical
section isn't very large, and does not contain any waits etc.
If we keep read_seqbegin, then that gives landlock more opportunity to do
a reference-less parent walk, but at the expense that a d_move anywhere,
even if it doesn't affect anything we're currently looking at, will
temporarily block this landlocked application (even if not for very long),
and multiple concurrent renames might cause us to wait for longer (but
probably won't starve us since we just need one "cycle" where rename
seqcount is even).
Since we can still safely do a parent walk, just needing to take dentry
references on our way, we could simply fallback to that in this situation.
i.e. we can use raw_seqcount_begin and keep the seqcount & 1 check.
Now, there is the argument that if d_move is very quick, then it might be
worth waiting for it to finish, and we will fallback to the original
parent walk if the seqcount changes again. I'm not sure which is best,
but I'm inclining towards changing this to raw_seqcount_begin, as this is
purely an optimization, and we do not _need_ to avoid concurrent renames.
More information about the Linux-security-module-archive
mailing list