[PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

Siddh Raman Pant code at siddh.me
Wed Aug 3 08:40:06 UTC 2022


On Wed, 03 Aug 2022 11:11:31 +0530  Eric Biggers <ebiggers at kernel.org> wrote:
> I tested the syzbot reproducer
> https://syzkaller.appspot.com/text?tag=ReproC&x=174ea97e080000, and it does
> *not* trigger the bug on the latest upstream.  But, it does trigger the bug if I
> recent Linus's recent watch_queue fixes.
> 
> So I don't currently see any evidence of an unfixed bug.  If, nevertheless, you
> believe that a bug is still unfixed, then please provide a reproducer and a fix
> patch that clearly explains what it is fixing. 
> 
> > There is a null check in post_one_notification for the pipe, most probably
> > because it *expects* the pointer to be NULL'd. Also, there is no reason to have
> > a dangling pointer stay, it's just a recipe for further bugs.
> 
> If you want to send a patch or patches to clean up the code, that is fine, but
> please make it super clear what is a cleanup and what is a fix.
> 
> - Eric
> 

I honestly feel like I am repeating myself yet again, but okay.

Of course, the race condition has been solved by a patch upstream, which I had
myself mentioned earlier.

But what I am saying is that it did *not* address *what* that race condition
had triggered, i.e. the visible cause of the UAF crash, which, among other
things, is *because* there is a dangling pointer to the freed pipe, which
*caused* the crash in post_one_notification() when it tried to access
&pipe->rd.wait_lock as an argument to spin_lock_irq(), a path it reached
after checking if wqueue->pipe is NULL and proceeded when it was not the case.

And the upstream commit was made *after* I had posted this patch, hence this
was a fix for the syzkaller issue. While I am *not* saying to accept it just
because this was posted earlier, I am saying this patch addresses a parallel
issue, i.e. the *actual use-after-free crash* which was reproduced by those
reproducers, i.e., what was attempted to be used after getting freed and
detected by KASAN.

We don't need to wait for another similar syzbot report to pop up before doing
this change, and say let's not fix a dangling pointer reference because now
another commit apparately fixes the specific syzkaller issue, causing the given
specific reproducer with its specific way of reproducing to fail, when we in
fact now know it *can* be a valid problem in practice and doing this change
too causes the specific reproducer under consideration to fail reproducing, as
was reported by the reproducer itself.

I really don't know how to create stress tests / reproducers like how syzkaller
makes, so if a similar new reproducer is really required for showing this
patch's validity disregarding any earlier reproducers, I unfortunately cannot
make it due to skill issue as I just started in kernel dev, and I am deeply
sorry for wasting the time of everyone, and I am thankful for your criticism of
my patch.

Thanks,
Siddh



More information about the Linux-security-module-archive mailing list