[PATCH 1/2 v2] bcachefs: do not use PF_MEMALLOC_NORECLAIM
Kent Overstreet
kent.overstreet at linux.dev
Thu Aug 29 13:22:26 UTC 2024
On Thu, Aug 29, 2024 at 11:12:18PM GMT, Dave Chinner wrote:
> On Thu, Aug 29, 2024 at 06:02:32AM -0400, Kent Overstreet wrote:
> > On Wed, Aug 28, 2024 at 02:09:57PM GMT, Dave Chinner wrote:
> > > On Tue, Aug 27, 2024 at 08:15:43AM +0200, Michal Hocko wrote:
> > > > From: Michal Hocko <mhocko at suse.com>
> > > >
> > > > bch2_new_inode relies on PF_MEMALLOC_NORECLAIM to try to allocate a new
> > > > inode to achieve GFP_NOWAIT semantic while holding locks. If this
> > > > allocation fails it will drop locks and use GFP_NOFS allocation context.
> > > >
> > > > We would like to drop PF_MEMALLOC_NORECLAIM because it is really
> > > > dangerous to use if the caller doesn't control the full call chain with
> > > > this flag set. E.g. if any of the function down the chain needed
> > > > GFP_NOFAIL request the PF_MEMALLOC_NORECLAIM would override this and
> > > > cause unexpected failure.
> > > >
> > > > While this is not the case in this particular case using the scoped gfp
> > > > semantic is not really needed bacause we can easily pus the allocation
> > > > context down the chain without too much clutter.
> > > >
> > > > Acked-by: Christoph Hellwig <hch at lst.de>
> > > > Signed-off-by: Michal Hocko <mhocko at suse.com>
> > >
> > > Looks good to me.
> > >
> > > Reviewed-by: Dave Chinner <dchinner at redhat.com>
> >
> > Reposting what I wrote in the other thread:
>
> I've read the thread. I've heard what you have had to say. Like
> several other people, I think your position is just not practical or
> reasonable.
>
> I don't care about the purity or the safety of the API - the
> practical result of PF_MEMALLOC_NORECLAIM is that __GFP_NOFAIL
> allocation can now fail and that will cause unexpected kernel
> crashes. Keeping existing code and API semantics working correctly
> (i.e. regression free) takes precedence over new functionality or
> API features that people want to introduce.
>
> That's all there is to it. This is not a hill you need to die on.
If you use GFP_NOFAIL in a context where you're not allowed to sleep,
that's a bug, same as any other bug where you get the gfp flags wrong
(e.g. GFP_KERNEL in non sleepable context).
This isn't going to affect you unless you start going around inserting
PF_MEMALLOC_NORECLAIM where it doesn't need to be. Why would you do
that?
But the lack of gfp flags for pte allocation means that this actually is
a serious gap we need to be fixing.
More information about the Linux-security-module-archive
mailing list