[PATCH 0/2 v2] remove PF_MEMALLOC_NORECLAIM
Kent Overstreet
kent.overstreet at linux.dev
Thu Sep 5 14:05:15 UTC 2024
On Thu, Sep 05, 2024 at 09:53:26AM GMT, Theodore Ts'o wrote:
> On Thu, Sep 05, 2024 at 01:26:50PM +0200, Michal Hocko wrote:
> > > > > > This is exactly GFP_KERNEL semantic for low order allocations or
> > > > > > kvmalloc for that matter. They simply never fail unless couple of corner
> > > > > > cases - e.g. the allocating task is an oom victim and all of the oom
> > > > > > memory reserves have been consumed. This is where we call "not possible
> > > > > > to allocate".
> > > > >
> > > > > Which does beg the question of why GFP_NOFAIL exists.
> > > >
> > > > Exactly for the reason that even rare failure is not acceptable and
> > > > there is no way to handle it other than keep retrying. Typical code was
> > > > while (!(ptr = kmalloc()))
> > > > ;
> > >
> > > But is it _rare_ failure, or _no_ failure?
> > >
> > > You seem to be saying (and I just reviewed the code, it looks like
> > > you're right) that there is essentially no difference in behaviour
> > > between GFP_KERNEL and GFP_NOFAIL.
>
> That may be the current state of affairs; but is it
> ****guaranteed**** forever and ever, amen, that GFP_KERNEL will never
> fail if the amount of memory allocated was lower than a particular
> multiple of the page size? If so, what is that size? I've checked,
> and this is not documented in the formal interface.
Yeah, and I think we really need to make that happen, in order to head
off a lot more silliness in the future.
We'd also be documenting at the same time _exactly_ when it is required
to check for errors:
- small, fixed-size allocation in a known sleepable context: safe to skip
- anything else, i.e. variable-sized allocation or library code that can
be called from different contexts: you check for errors (and probably
that's just "something crazy has happened, emergency shutdown" for the
xfs/ext4 paths)
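As a rough sketch of that rule (not kernel code, and not a documented
interface), the decision could be written as a predicate. The names
alloc_site and must_check_alloc are hypothetical, and small_limit stands
in for the "small allocation" bound, which is passed in by the caller
precisely because nobody has pinned it down yet:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one allocation site; not a kernel type. */
struct alloc_site {
	size_t size;      /* bytes requested */
	bool fixed_size;  /* size is a compile-time constant */
	bool may_sleep;   /* caller is known to be in a sleepable context */
};

/*
 * Returns true when the caller has to check for allocation failure
 * under the rule sketched above. small_limit is an assumption: the
 * real bound for a "small" allocation is not documented anywhere.
 */
static bool must_check_alloc(const struct alloc_site *site, size_t small_limit)
{
	assert(site != NULL);

	/* Only small, fixed-size requests qualify for skipping the check. */
	bool small_fixed = site->fixed_size && site->size <= small_limit;

	/* Anything else, or any context that may not sleep, must check. */
	return !(small_fixed && site->may_sleep);
}
```

A 64-byte fixed-size allocation in a sleepable context would return
false (safe to skip); the same request with a variable size, or from a
context that cannot sleep, would return true.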
> > The fundamental difference is that (apart from unsupported allocation
> > mode/size) the latter never returns NULL and you can rely on that fact.
> > Our documentation says:
> > * %__GFP_NOFAIL: The VM implementation _must_ retry infinitely: the caller
> > * cannot handle allocation failures. The allocation could block
> > * indefinitely but will never return with failure. Testing for
> > * failure is pointless.
>
> So if the documentation is going to give similar guarantees, as
> opposed to it being an accident of the current implementation that is
> subject to change at any time, then sure, we can probably get away
> with all or most of ext4's uses of __GFP_NOFAIL. But I don't want to
> do that and then have a "Lucy and Charlie Brown" moment from the
> Peanuts comics strip where the football suddenly gets snatched away
> from us[1] (and many file system users will be very, very sad and/or
> angry).
Yeah, absolutely, and the "what is a small allocation" limit needs to be
nailed down as well.