[PATCH 1/6] fs/exec: Drop task_lock() inside __get_task_comm()
Alexei Starovoitov
alexei.starovoitov at gmail.com
Sun Jun 2 18:23:17 UTC 2024
On Sun, Jun 2, 2024 at 10:53 AM Eric W. Biederman <ebiederm at xmission.com> wrote:
>
> Alexei Starovoitov <alexei.starovoitov at gmail.com> writes:
>
> > On Sat, Jun 1, 2024 at 11:57 PM Yafang Shao <laoar.shao at gmail.com> wrote:
> >>
> >> On Sun, Jun 2, 2024 at 11:52 AM Eric W. Biederman <ebiederm at xmission.com> wrote:
> >> >
> >> > Yafang Shao <laoar.shao at gmail.com> writes:
> >> >
> >> > > Quoted from Linus [0]:
> >> > >
> >> > > Since user space can randomly change their names anyway, using locking
> >> > > was always wrong for readers (for writers it probably does make sense
> >> > > to have some lock - although practically speaking nobody cares there
> >> > > either, but at least for a writer some kind of race could have
> >> > > long-term mixed results
> >> >
> >> > Ugh.
> >> > Ick.
> >> >
> >> > This code is buggy.
> >> >
> >> > I won't argue that Linus is wrong, about removing the
> >> > task_lock.
> >> >
> >> > Unfortunately strscpy_pad does not work properly with the
> >> > task_lock removed, and buf_size larger that TASK_COMM_LEN.
> >> > There is a race that will allow reading past the end
> >> > of tsk->comm, if we read while tsk->common is being
> >> > updated.
> >>
> >> It appears so. Thanks for pointing it out. Additionally, other code,
> >> such as the BPF helper bpf_get_current_comm(), also uses strscpy_pad()
> >> directly without the task_lock. It seems we should change that as
> >> well.
> >
> > Hmm. What race do you see?
> > If lock is removed from __get_task_comm() it probably can be removed from
> > __set_task_comm() as well.
> > And both are calling strscpy_pad to write and read comm.
> > So I don't see how it would read past sizeof(comm),
> > because 'buf' passed into __set_task_comm is NUL-terminated.
> > So the concurrent read will find it.
>
> The read may race with a write that is changing the location
> of '\0'. Especially if the new value is shorter than
> the old value.
so ?
strscpy_pad in __[gs]et_task_comm will read/write either long
or byte at a time.
Assume 64 bit and, say, we had comm where 2nd u64 had NUL.
Now two cpus are racing. One is writing shorter comm.
Another is reading.
The latter can read 1st u64 without NUL and will proceed
to read 2nd u64. Either it will read the old u64 with NUL in it
or it will read all zeros in 2nd u64 or some zeros in 2nd u64
depending on how the compiler generated memset(.., 0, ..)
as part of strscpy_pad().
_pad() part is critical here.
If it was just strscpy() then there would indeed be a chance
of reading both u64-s and not finding NUL in any of them.
> If you are performing lockless reads and depending upon a '\0'
> terminator without limiting yourself to the size of the buffer
> there needs to be a big fat comment as to how in the world
> you are guaranteed that a '\0' inside the buffer will always
> be found.
I think Yafang can certainly add such a comment next to
__[gs]et_task_comm.
I prefer to avoid open coding memcpy + mmemset when strscpy_pad works.
More information about the Linux-security-module-archive
mailing list