[PATCH v7 1/2] mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options
Alexander Potapenko
glider at google.com
Fri Jun 21 14:10:19 UTC 2019
On Fri, Jun 21, 2019 at 10:57 AM Alexander Potapenko <glider at google.com> wrote:
>
> On Fri, Jun 21, 2019 at 9:09 AM Michal Hocko <mhocko at kernel.org> wrote:
> >
> > On Mon 17-06-19 17:10:49, Alexander Potapenko wrote:
> > > The new options are needed to prevent possible information leaks and
> > > make control-flow bugs that depend on uninitialized values more
> > > deterministic.
> > >
> > > init_on_alloc=1 makes the kernel initialize newly allocated pages and heap
> > > objects with zeroes. Initialization is done at allocation time at the
> > > places where checks for __GFP_ZERO are performed.
> > >
> > > init_on_free=1 makes the kernel initialize freed pages and heap objects
> > > with zeroes upon their deletion. This helps to ensure sensitive data
> > > doesn't leak via use-after-free accesses.
> > >
> > > Both init_on_alloc=1 and init_on_free=1 guarantee that the allocator
> > > returns zeroed memory. The two exceptions are slab caches with
> > > constructors and SLAB_TYPESAFE_BY_RCU flag. Those are never
> > > zero-initialized to preserve their semantics.
> > >
> > > Both init_on_alloc and init_on_free default to zero, but those defaults
> > > can be overridden with CONFIG_INIT_ON_ALLOC_DEFAULT_ON and
> > > CONFIG_INIT_ON_FREE_DEFAULT_ON.
> > >
> > > Slowdown for the new features compared to init_on_free=0,
> > > init_on_alloc=0:
> > >
> > > hackbench, init_on_free=1: +7.62% sys time (st.err 0.74%)
> > > hackbench, init_on_alloc=1: +7.75% sys time (st.err 2.14%)
> > >
> > > Linux build with -j12, init_on_free=1: +8.38% wall time (st.err 0.39%)
> > > Linux build with -j12, init_on_free=1: +24.42% sys time (st.err 0.52%)
> > > Linux build with -j12, init_on_alloc=1: -0.13% wall time (st.err 0.42%)
> > > Linux build with -j12, init_on_alloc=1: +0.57% sys time (st.err 0.40%)
> > >
> > > The slowdown for init_on_free=0, init_on_alloc=0 compared to the
> > > baseline is within the standard error.
> > >
> > > The new features are also going to pave the way for hardware memory
> > > tagging (e.g. arm64's MTE), which will require both on_alloc and on_free
> > > hooks to set the tags for heap objects. With MTE, tagging will have the
> > > same cost as memory initialization.
> > >
> > > Although init_on_free is rather costly, there are paranoid use-cases where
> > > in-memory data lifetime is desired to be minimized. There are various
> > > arguments for/against the realism of the associated threat models, but
> > > given that we'll need the infrastructre for MTE anyway, and there are
> > > people who want wipe-on-free behavior no matter what the performance cost,
> > > it seems reasonable to include it in this series.
> >
> > Thanks for reworking the original implemenation. This looks much better!
> >
> > > Signed-off-by: Alexander Potapenko <glider at google.com>
> > > Acked-by: Kees Cook <keescook at chromium.org>
> > > To: Andrew Morton <akpm at linux-foundation.org>
> > > To: Christoph Lameter <cl at linux.com>
> > > To: Kees Cook <keescook at chromium.org>
> > > Cc: Masahiro Yamada <yamada.masahiro at socionext.com>
> > > Cc: Michal Hocko <mhocko at kernel.org>
> > > Cc: James Morris <jmorris at namei.org>
> > > Cc: "Serge E. Hallyn" <serge at hallyn.com>
> > > Cc: Nick Desaulniers <ndesaulniers at google.com>
> > > Cc: Kostya Serebryany <kcc at google.com>
> > > Cc: Dmitry Vyukov <dvyukov at google.com>
> > > Cc: Sandeep Patil <sspatil at android.com>
> > > Cc: Laura Abbott <labbott at redhat.com>
> > > Cc: Randy Dunlap <rdunlap at infradead.org>
> > > Cc: Jann Horn <jannh at google.com>
> > > Cc: Mark Rutland <mark.rutland at arm.com>
> > > Cc: Marco Elver <elver at google.com>
> > > Cc: linux-mm at kvack.org
> > > Cc: linux-security-module at vger.kernel.org
> > > Cc: kernel-hardening at lists.openwall.com
> >
> > Acked-by: Michal Hocko <mhocko at suse.cz> # page allocator parts.
> >
> > kmalloc based parts look good to me as well but I am not sure I fill
> > qualified to give my ack there without much more digging and I do not
> > have much time for that now.
> >
> > [...]
> > > diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> > > index fd5c95ff9251..2f75dd0d0d81 100644
> > > --- a/kernel/kexec_core.c
> > > +++ b/kernel/kexec_core.c
> > > @@ -315,7 +315,7 @@ static struct page *kimage_alloc_pages(gfp_t gfp_mask, unsigned int order)
> > > arch_kexec_post_alloc_pages(page_address(pages), count,
> > > gfp_mask);
> > >
> > > - if (gfp_mask & __GFP_ZERO)
> > > + if (want_init_on_alloc(gfp_mask))
> > > for (i = 0; i < count; i++)
> > > clear_highpage(pages + i);
> > > }
> >
> > I am not really sure I follow here. Why do we want to handle
> > want_init_on_alloc here? The allocated memory comes from the page
> > allocator and so it will get zeroed there. arch_kexec_post_alloc_pages
> > might touch the content there but is there any actual risk of any kind
> > of leak?
> You're right, we don't want to initialize this memory if init_on_alloc is on.
> We need something along the lines of:
> if (!static_branch_unlikely(&init_on_alloc))
> if (gfp_mask & __GFP_ZERO)
> // clear the pages
>
> Another option would be to disable initialization in alloc_pages() using a flag.
> >
> > > diff --git a/mm/dmapool.c b/mm/dmapool.c
> > > index 8c94c89a6f7e..e164012d3491 100644
> > > --- a/mm/dmapool.c
> > > +++ b/mm/dmapool.c
> > > @@ -378,7 +378,7 @@ void *dma_pool_alloc(struct dma_pool *pool, gfp_t mem_flags,
> > > #endif
> > > spin_unlock_irqrestore(&pool->lock, flags);
> > >
> > > - if (mem_flags & __GFP_ZERO)
> > > + if (want_init_on_alloc(mem_flags))
> > > memset(retval, 0, pool->size);
> > >
> > > return retval;
> >
> > Don't you miss dma_pool_free and want_init_on_free?
> Agreed.
> I'll fix this and add tests for DMA pools as well.
This doesn't seem to be easy though. One needs a real DMA-capable
device to allocate using DMA pools.
On the other hand, what happens to a DMA pool when it's destroyed,
isn't it wiped by pagealloc?
I'm inclined towards not touching mm/dmapool.c in this patch series,
as it is probably orthogonal to the idea of hardening the
heap/pagealloc.
> > --
> > Michal Hocko
> > SUSE Labs
>
>
>
> --
> Alexander Potapenko
> Software Engineer
>
> Google Germany GmbH
> Erika-Mann-Straße, 33
> 80636 München
>
> Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
> Registergericht und -nummer: Hamburg, HRB 86891
> Sitz der Gesellschaft: Hamburg
--
Alexander Potapenko
Software Engineer
Google Germany GmbH
Erika-Mann-Straße, 33
80636 München
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
More information about the Linux-security-module-archive
mailing list