[PATCH 1/2 v2] bcachefs: do not use PF_MEMALLOC_NORECLAIM

Kent Overstreet kent.overstreet at linux.dev
Thu Aug 29 10:02:32 UTC 2024


On Wed, Aug 28, 2024 at 02:09:57PM GMT, Dave Chinner wrote:
> On Tue, Aug 27, 2024 at 08:15:43AM +0200, Michal Hocko wrote:
> > From: Michal Hocko <mhocko at suse.com>
> > 
> > bch2_new_inode relies on PF_MEMALLOC_NORECLAIM to try to allocate a new
> > inode to achieve GFP_NOWAIT semantic while holding locks. If this
> > allocation fails it will drop locks and use GFP_NOFS allocation context.
> > 
> > We would like to drop PF_MEMALLOC_NORECLAIM because it is really
> > dangerous to use if the caller doesn't control the full call chain with
> > this flag set. E.g. if any function down the chain needed a
> > GFP_NOFAIL request, PF_MEMALLOC_NORECLAIM would override it and
> > cause an unexpected failure.
> > 
> > While that is not the case here, the scoped gfp semantic is not
> > really needed because we can easily push the allocation context
> > down the chain without too much clutter.
> > 
> > Acked-by: Christoph Hellwig <hch at lst.de>
> > Signed-off-by: Michal Hocko <mhocko at suse.com>
> 
> Looks good to me.
> 
> Reviewed-by: Dave Chinner <dchinner at redhat.com>

Reposting what I wrote in the other thread:

This series is based on a fundamental misunderstanding of what a safe
API is: a safe API is not one that doesn't return errors, it's one that
never invokes undefined behaviour.

It was decided years ago that the scoped APIs were the better
replacement for the gfp APIs, and they need to exist.

This "GFP_NOFAIL exists therefore we can't tell the memory allocator
about situations where it would have to fail" is utter insanity - it's
the exact opposite of defining a safe API.

A safe API would be one where we /did/ always tell the memory allocator
when we're in non-sleepable context, and the allocator always returned
failure instead of context switching.
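
As a rough illustration of that idea, here is a minimal userspace
sketch, not kernel code. Every name in it (nosleep_scope_enter,
nosleep_scope_exit, model_alloc, under_pressure, sleep_count) is
hypothetical and exists only for this model; a single static counter
stands in for a per-task flag. The point it shows: once the scope is
entered, the allocator returns NULL rather than blocking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical scope depth; a single-threaded stand-in for a per-task flag. */
static unsigned int nosleep_depth;

/* Simulated memory pressure: when true, no memory is available without waiting. */
static bool under_pressure;

/* Counts how often the model allocator had to "sleep" (simulated reclaim). */
static int sleep_count;

static void nosleep_scope_enter(void)
{
	nosleep_depth++;
}

static void nosleep_scope_exit(void)
{
	assert(nosleep_depth > 0);
	nosleep_depth--;
}

/*
 * Model allocator: inside a no-sleep scope it fails instead of waiting;
 * outside one it "sleeps" (here: just bumps a counter and clears the
 * simulated pressure) and then allocates.
 */
static void *model_alloc(size_t size)
{
	if (!under_pressure)
		return malloc(size);

	if (nosleep_depth > 0)
		return NULL;	/* caller told us it cannot sleep: report failure */

	sleep_count++;		/* stands in for blocking in reclaim */
	under_pressure = false;
	return malloc(size);
}
```

With under_pressure set, a call made between nosleep_scope_enter() and
nosleep_scope_exit() returns NULL and leaves sleep_count untouched; the
same call made after the scope exits "sleeps" once and succeeds.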

This is utter brain damage; rule #1 of kernel programming is that _you
check for errors_. If you don't know that your GFP_NOFAIL usage is in a
safe context (and that's not just memory reclaim context, it's also the
size of your allocation) then you have to check for errors.
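
For concreteness, the "check for errors" pattern the quoted patch
description refers to (try a non-blocking allocation with locks held,
and on failure drop the locks and retry in a context that may block)
can be sketched in userspace as below. This is an assumption-laden
model, not the bcachefs code: model_lock, model_unlock, model_alloc,
new_object_locked and nowait_failures_left are all hypothetical names,
and a bool stands in for the real locks.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

enum alloc_mode { ALLOC_NOWAIT, ALLOC_MAY_BLOCK };

/* Hypothetical lock state; a bool stands in for the real locks. */
static bool lock_held;

/* Test hook: number of upcoming ALLOC_NOWAIT requests that should fail. */
static int nowait_failures_left;

static void model_lock(void)
{
	assert(!lock_held);
	lock_held = true;
}

static void model_unlock(void)
{
	assert(lock_held);
	lock_held = false;
}

static void *model_alloc(size_t size, enum alloc_mode mode)
{
	if (mode == ALLOC_NOWAIT) {
		if (nowait_failures_left > 0) {
			nowait_failures_left--;
			return NULL;	/* would have had to wait: fail instead */
		}
		return malloc(size);
	}

	/* A request that may block must never be issued with the lock held. */
	assert(!lock_held);
	return malloc(size);
}

/*
 * Called with the lock held and returns with it held.
 * Returns 0 on success and stores the object in *out, or -ENOMEM.
 */
static int new_object_locked(size_t size, void **out)
{
	void *obj;

	assert(lock_held);

	obj = model_alloc(size, ALLOC_NOWAIT);
	if (!obj) {
		/* Fallback: drop the lock, allocate where blocking is allowed. */
		model_unlock();
		obj = model_alloc(size, ALLOC_MAY_BLOCK);
		model_lock();
		if (!obj)
			return -ENOMEM;	/* still checked: the retry can fail too */
	}

	*out = obj;
	return 0;
}
```

Both allocation attempts are checked, and the caller only ever sees 0
or -ENOMEM. Real code would also have to revalidate whatever the lock
protected after retaking it, which this sketch leaves out.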


