[PATCH v3] security: Expand task_setscheduler LSM hook to include CPU affinity mask

Aaron Tomlin atomlin at atomlin.com
Mon Jun 15 15:22:08 UTC 2026


On Wed, May 27, 2026 at 09:19:11PM -0400, Aaron Tomlin wrote:
> On Wed, May 27, 2026 at 09:58:58PM +0200, Peter Zijlstra wrote:
> > On Wed, May 27, 2026 at 01:41:52PM -0400, Aaron Tomlin wrote:
> > 
> > > > > The actual use case here is multi-tenant workload isolation and visibility.
> > > > > Passing the evaluated cpumask to the BPF LSM allows operators to write a
> > > > > simple eBPF program to detect spatial boundary overlaps (e.g., logging an
> > > > > event if a requested mask intersects with platform-reserved cores).
> > 
> > Why isn't cgroups good enough to enforce this? If you create a cgroup
> > hierarchy per tenant, and constrain them using the cpuset controller,
> > they should not be able to escape, rendering this event impossible.
> 
> Hi Peter,
> 
> You raise a very fair point. The cpuset cgroup controller is indeed the
> kernel's primary vehicle for spatial enforcement, and under normal
> circumstances, it successfully prevents a tenant from escaping their
> designated cores.
> 
> The cpuset controller does govern resource limits, but does not audit
> intent. When __sched_setaffinity() is invoked, the kernel compares the
> requested in_mask against the task's allowed cpuset. If there is only a
> partial intersection, the kernel silently truncates the requested mask to
> fit the cpuset, without raising any alarm.
> 
> The BPF LSM hook, conversely, receives the raw, untruncated in_mask,
> affording operators the visibility to detect, audit, and even reject these
> violations of intent before the kernel silently sanitises the input.
> 
> This patch does not seek to replace the cpuset controller, but rather to
> complement it by providing auditing capabilities.
> 
> > > We are not creating a bespoke BPF hook here; rather, we are rectifying a
> > > historical blind spot within the API. The existing LSM hook is invoked
> > > during sched_setaffinity(), yet it presently receives only the task_struct
> > > pointer. Consequently, the security module is essentially asked, "Should
> > > Process A be permitted to alter Process B's affinity?" without being
> > > informed of the proposed affinity itself. Providing in_mask simply
> > > furnishes the existing hook with the requisite payload to make an informed
> > > decision.
> > 
> > It occurs to me that this same argument would require to also pass in
> > the new sched_attr, no? That way the LSM can inspect the new policy
> > before it becomes effective.
> 
> I agree, the underlying logic does indeed extend perfectly to sched_attr.
> 
> Presently, the LSM is equally oblivious as to whether a process is
> requesting a benign transition to SCHED_BATCH, or attempting to escalate
> its privileges by requesting a real-time policy such as SCHED_FIFO with
> maximum priority. Just as with the CPU mask, providing the sched_attr
> payload would rectify this parallel blind spot, allowing BPF policies to
> inspect and mediate scheduling attributes before they become effective.
> 
> If you are amenable, I should be more than happy to expand the scope of the
> forthcoming patch to include this. Alternatively, we could address the
> sched_attr expansion in a separate, subsequent patch. Personally, I would
> favour the latter approach, but please do let me know your preference.
> 
> I very much look forward to hearing Paul's thoughts on whether this aligns
> with the broader LSM vision.

Hi Paul,

I am writing to follow up on the discussion above regarding the proposed
enhancement to the task_setscheduler LSM hook, as invoked from
sched_setaffinity().

As you will see from the thread, Peter Zijlstra and I have discussed the
architectural justification for this change. While the cpuset cgroup
controller effectively handles spatial enforcement, it silently truncates
requested affinity masks. Passing the raw in_mask to the LSM hook enables
security modules (such as the BPF LSM) to audit and mediate the actual
intent of the request before the kernel sanitises the input, a capability
that cgroups inherently lack.
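
To make the distinction concrete, here is a rough user-space sketch. It is
not kernel code: cpu_bits_t, cpu_bit(), clamp_to_cpuset() and
touches_reserved() are hypothetical names invented purely for illustration,
and a 64-bit word stands in for a real cpumask. It only models the idea that
the clamped mask no longer reveals what was originally requested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpu_bits_t;

/* Build a single-CPU mask; illustration helper only. */
static cpu_bits_t cpu_bit(unsigned int cpu)
{
	assert(cpu < 64);
	return (cpu_bits_t)1 << cpu;
}

/*
 * Models the cpuset clamp: CPUs outside the allowed set are dropped
 * without any notification. The fully disjoint case is not modelled.
 */
static cpu_bits_t clamp_to_cpuset(cpu_bits_t requested, cpu_bits_t allowed)
{
	return requested & allowed;
}

/*
 * Models the check a security module could perform if it were handed
 * the raw requested mask: does the request touch reserved CPUs?
 */
static bool touches_reserved(cpu_bits_t requested, cpu_bits_t reserved)
{
	return (requested & reserved) != 0;
}
```

With a request for CPUs 0-3, a cpuset of CPUs 2-5, and CPUs 0-1 reserved,
touches_reserved() is true for the raw request but false for the clamped
result, which is precisely the information lost after truncation.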

Furthermore, Peter rightly observed that this reasoning extends naturally
to sched_attr. Presently, the LSM cannot inspect whether a process is
requesting a benign scheduling policy or attempting to escalate to a
real-time priority. I am entirely amenable to addressing this parallel
blind spot, preferably in a subsequent patch.
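
As a similarly rough user-space sketch of the sched_attr case (again not
kernel code; struct model_sched_request and requests_realtime() are
hypothetical names, and the real struct sched_attr carries more fields), the
decision a policy could make once it is shown the requested attributes might
look like this:

```c
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

/* SCHED_BATCH is Linux-specific; define it if the libc headers hide it. */
#ifndef SCHED_BATCH
#define SCHED_BATCH 3
#endif

/* Hypothetical, trimmed-down stand-in for the requested attributes. */
struct model_sched_request {
	int policy;   /* e.g. SCHED_OTHER, SCHED_BATCH, SCHED_FIFO, SCHED_RR */
	int priority; /* only meaningful for the real-time policies */
};

/* True if the request asks for one of the real-time policies. */
static bool requests_realtime(const struct model_sched_request *req)
{
	assert(req != NULL);
	return req->policy == SCHED_FIFO || req->policy == SCHED_RR;
}
```

A hook that only receives the target task cannot make this distinction; one
that also receives the requested attributes could treat a SCHED_BATCH
request differently from a SCHED_FIFO one.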

Before I proceed any further, I would be most grateful for your perspective
as the Security sub-system maintainer. Do you feel this expansion is
acceptable?

As a brief administrative aside, please note that Thomas Bogendoerfer has
already queued the MIPS-specific changes related to this work into the
mips-next tree [1][2].

I look forward to hearing your thoughts.

[1]: https://lore.kernel.org/lkml/psb6pxogv2dlknps4p3sh6rt2h7xuuxkoif6ock5vxfz2jimec@txa6iy65crtb/
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/mips/linux.git/commit/?id=98e37db4a34d3af3fb2f4648295c25b5e40b20e3


Kind regards,
-- 
Aaron Tomlin



More information about the Linux-security-module-archive mailing list