[PATCH bpf-next 03/29] bpf: introduce BPF token object
Alexei Starovoitov
alexei.starovoitov at gmail.com
Fri Jan 5 22:27:43 UTC 2024
On Fri, Jan 5, 2024 at 2:06 PM Andrii Nakryiko
<andrii.nakryiko at gmail.com> wrote:
>
> On Fri, Jan 5, 2024 at 12:26 PM Linus Torvalds
> <torvalds at linuxfoundation.org> wrote:
> >
> > I'm still looking through the patches, but in the early parts I do
> > note this oddity:
> >
> > On Wed, 3 Jan 2024 at 14:21, Andrii Nakryiko <andrii at kernel.org> wrote:
> > >
> > > +struct bpf_token {
> > > + struct work_struct work;
> > > + atomic64_t refcnt;
> > > + struct user_namespace *userns;
> > > + u64 allowed_cmds;
> > > +};
> >
> > Ok, not huge, and makes sense, although I wonder if that
> >
> > atomic64_t refcnt;
> >
> > should just be 'atomic_long_t' since presumably on 32-bit
> > architectures you can't create enough references for a 64-bit atomic
> > to make much sense.
> >
> > Or are there references to tokens that might not use any memory?
> >
> > Not a big deal, but 'atomic64_t' is very expensive on 32-bit
> > architectures, and doesn't seem to make much sense unless you really
> > specifically need 64 bits for some reason.
>
> I used atomic64_t for consistency with other BPF objects (program,
> etc) and not to have to worry even about hypothetical overflows.
> 32-bit atomic performance doesn't seem to be a big concern as a token
> is passed into a pretty heavy-weight operations that create new BPF
> object (map, program, BTF object), no matter how slow refcounting is
> it will be lost in the noise for those operations.
To add a bit more context here...
Back in 2016 Jann managed to overflow 32-bit prog/map counters:
"
On a system with >32Gbyte of physical memory,
the malicious application may overflow 32-bit bpf program refcnt.
It's also possible to overflow map refcnt on 1Tb system.
"
We mitigated that with fixed limits:
- atomic_inc(&map->refcnt);
+ if (atomic_inc_return(&map->refcnt) > BPF_MAX_REFCNT) {
+ atomic_dec(&map->refcnt);
+ return ERR_PTR(-EBUSY);
+ }
but it created quite a lot of error handling pain throughout
the code, so eventually in 2019 we switched to atomic64_t refcnt
and never looked back.
I suspect Jann will be able to overflow 32-bit token refcnt,
so atomic64 was chosen for simplicity.
atomic_long_t might work too, but the effort to think it through
is not worth it at this point, since performance of
inc/dec doesn't matter here.
Eventually we can do a follow up and consistently update
all such counters to atomic_long_t.
More information about the Linux-security-module-archive
mailing list