[PATCH 1/3] cgroup/cpuset: Fix deadline bandwidth leak in cpuset_can_attach()

Mon May 11 17:54:37 UTC 2026

On 5/11/26 7:08 AM, Aaron Tomlin wrote:
> On Mon, May 11, 2026 at 01:10:02AM -0400, Waiman Long wrote:
>> On 5/9/26 12:48 PM, Aaron Tomlin wrote:
>>> During a cgroup migration, cpuset_can_attach() iterates over the
>>> provided taskset. If a task within the batch is a deadline (DL) task,
>>> the destination cpuset's DL metrics (i.e., nr_migrate_dl_tasks and
>>> sum_migrate_dl_bw) are appropriately incremented.
>>>
>>> However, if a subsequent task in the same migration batch fails the
>>> task_can_attach() check, the loop aborts and jumps directly to
>>> out_unlock. Consequently, any DL metrics accumulated from previously
>>> processed tasks in the batch remain permanently inflated in the
>>> destination cpuset. Because the migration is subsequently aborted by the
>>> cgroup core, cpuset_cancel_attach() is never invoked to unwind these
>>> specific increments.
>>>
>>> This behaviour results in a permanent leak of deadline bandwidth, which
>>> incorrectly restricts the admission control capacity of the destination
>>> cpuset.
>>>
>>> To resolve this, introduce an out_unlock_reset failure path that
>>> conditionally invokes reset_migrate_dl_data(). This guarantees that if a
>>> batch migration is aborted for any reason, the pending DL metrics are
>>> safely reset before returning the error.
>>>
>>> Fixes: 0a67b847e1f06 ("cpuset: Allow setscheduler regardless of manipulated task")
>> That is not the commit that introduced the bug. Anyway, there is already
>> another patch sent recently to fix this bug. See
>>
>> https://lore.kernel.org/lkml/20260509102031.97608-2-zhangguopeng@kylinos.cn/
>>
> Hi Waiman,
>
> Thank you for the follow-up.
>
> Acknowledged. I will drop this patch in the next iteration due to [1].
>
> Please note, the sashiko AI Review bot reported: cpuset_can_attach()
> incorrectly assumes all migrating tasks originate from the same source
> cpuset. At first glance, this feedback is valid. I plan to submit a patch,
> if no solution was already proposed.
>
> [1]: https://lore.kernel.org/lkml/20260509102031.97608-2-zhangguopeng@kylinos.cn/

Yes, it does look like the AI feedback is valid. I will take a further 
look into this.

Thanks,
Longman
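
For reference, the leak described in the quoted commit message can be
modelled in user space roughly as below. This is only a sketch under
stated assumptions, not the kernel code: struct cpuset_model, struct
task_model, can_attach_model() and the reset_on_failure flag are
hypothetical stand-ins, and the attachable field merely models the result
of task_can_attach(). Only the counter names, reset_migrate_dl_data() and
the out_unlock_reset label come from the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the destination cpuset; only the two
 * counter names are taken from the commit message. */
struct cpuset_model {
	int nr_migrate_dl_tasks;
	uint64_t sum_migrate_dl_bw;
};

/* Hypothetical stand-in for a task in the migration batch. */
struct task_model {
	bool is_deadline;	/* models dl_task() */
	uint64_t dl_bw;		/* bandwidth charged for a DL task */
	bool attachable;	/* models task_can_attach() succeeding */
};

/* Drop all pending DL accounting on the destination cpuset. */
void reset_migrate_dl_data(struct cpuset_model *cs)
{
	cs->nr_migrate_dl_tasks = 0;
	cs->sum_migrate_dl_bw = 0;
}

/*
 * Walk the batch, accumulating DL metrics as we go.
 * Returns 0 on success and -1 if any task is rejected.
 * With reset_on_failure == false the function models the leaky
 * behaviour; with true it models the proposed failure path.
 */
int can_attach_model(struct cpuset_model *cs,
		     const struct task_model *tasks, size_t n,
		     bool reset_on_failure)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!tasks[i].attachable) {
			ret = -1;
			goto out_unlock_reset;
		}
		if (tasks[i].is_deadline) {
			cs->nr_migrate_dl_tasks++;
			cs->sum_migrate_dl_bw += tasks[i].dl_bw;
		}
	}
	return 0;

out_unlock_reset:
	/* Unwind increments made for earlier tasks in this batch. */
	if (reset_on_failure)
		reset_migrate_dl_data(cs);
	return ret;
}
```

With a batch of one DL task followed by one task that fails the check,
calling can_attach_model() with reset_on_failure set to false leaves the
counters inflated after the error return, while setting it to true
returns them to zero.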