[RFC PATCH -mm 0/4] mm, security, bpf: Fine-grained control over memory policy adjustments with lsm bpf

Michal Hocko mhocko at suse.com
Tue Nov 14 10:15:02 UTC 2023


On Mon 13-11-23 11:15:06, Yafang Shao wrote:
> On Mon, Nov 13, 2023 at 12:45 AM Casey Schaufler <casey at schaufler-ca.com> wrote:
> >
> > On 11/11/2023 11:34 PM, Yafang Shao wrote:
> > > Background
> > > ==========
> > >
> > > In our containerized environment, we've identified unexpected OOM events
> > > where the OOM-killer terminates tasks despite having ample free memory.
> > > This anomaly is traced back to tasks within a container using mbind(2) to
> > > bind memory to a specific NUMA node. When the allocated memory on this node
> > > is exhausted, the OOM-killer, prioritizing tasks based on oom_score,
> > > indiscriminately kills tasks. This becomes more critical with guaranteed
> > > tasks (oom_score_adj: -998) aggravating the issue.
> >
> > Is there some reason why you can't fix the callers of mbind(2)?
> > This looks like an user space configuration error rather than a
> > system security issue.
> 
> It appears my initial description may have caused confusion. In this
> scenario, the caller is an unprivileged user lacking any capabilities.
> While a privileged user, such as root, experiencing this issue might
> indicate a user space configuration error, the concerning aspect is
> the potential for an unprivileged user to disrupt the system easily.
> If this is perceived as a misconfiguration, the question arises: What
> is the correct configuration to prevent an unprivileged user from
> utilizing mbind(2)?

How is this any different from a non-NUMA (non-mbind) situation? An
unprivileged user can still allocate until the OOM killer triggers and
disrupt other workloads that consume more memory. Sure, the mempolicy
based OOM is less precise and it might select a victim with only a small
consumption on the target NUMA node, but fundamentally the situation is
very similar. I do not think disallowing mbind specifically solves a
real problem.
-- 
Michal Hocko
SUSE Labs


