[PATCH] security/commoncap: don't assume "setid" if all ids are identical

Thu May 8 03:32:25 UTC 2025

I'm looking at this code:

        /* Check for privilege-elevated exec. */
        if (is_setid ||
            (!__is_real(root_uid, new) &&
             (effective ||
              __cap_grew(permitted, ambient, new))))
                bprm->secureexec = 1;

If a luser runs a setuid program, then the kernel will set this
bprm->secureexec bit. Indeed, every time this program re-exec's
itself, that bit will consistently be set. Today.

However, with your change, that behavior will change. The first time
the program is exec'd by luser this bit will be set. However, it will
"surprisingly" not occur should the program exec itself again. If you
are unaware of this bit's purpose there is a nice writeup here:

https://www.kernel.org/doc/html/v5.1/security/LSM.html

See the "bprm_set_creds" paragraph. My concern is that there is an
exploit vector associated with an abuser setting LD_LIBRARY_PATH= to
something nefarious, and then invoking a setuid program that happens
to re-exec itself for some reason. The first invocation will be as
before, but when the binary re-exec's itself, I am concerned that this
could cause the privileged binary to load an exploit.

This has nothing to do with your interest in NO_NEW_PRIV but more to
do with the legacy behavior changes like this are exposed to.

Cheers

Andrew

On Tue, May 6, 2025 at 11:33 PM Max Kellermann <max.kellermann at ionos.com> wrote:
>
> On Wed, May 7, 2025 at 5:16 AM Andrew G. Morgan <morgan at kernel.org> wrote:
> > If a setuid program execs itself, does the presence of this code undo
> > any protection the kernel afforded it on its first invocation?
>
> What protection do you mean, and what behavior do you expect when
> setid execs itself? I see this affects:
>
> 1. reset effective ids to real ids (only affects NO_NEW_PRIVS)
> 2. new cap_permitted cannot be higher than old cap_permitted
> 3. clear cap_ambient
> 4. clear pdeath_signal (in begin_new_exec)
> 5. reset stack limits (in begin_new_exec)
>
> About these (from my very limited knowledge of this part of the kernel):
>
> 1. is my primary goal, and really no new privs gained by allowing the
> process to keep existing ids
> 2. only ever changes anything if new cap_permitted is higher, but if
> that's the case, the is_setid check is irrelevant because __cap_gained
> is true, therefore no change with my patch
> 3. as I already described, the kernel is wrong (or the documentation
> is wrong), and my patch adjusts kernel to behave as documented
> 4. I don't see how this is dangerous for anything regarding re-exec;
> if pdeath_signal wasn't reset on the first exec, it's safe to keep it
> after the re-exec, too
> 5. same as 4, I think
>
> Did I miss anything?
>
> > FWIW I ran the libcap quicktest.sh script against your change and it
> > doesn't break any capability thing I test for when making libcap
> > releases.
>
> Thanks for taking the time to run these tests. I'm glad the existing
> tests didn't find any obvious bugs. If we identify an actual problem
> with my patch, let's write a new test that fails with my patch!
>
> > That being said, in general this whole zoo of *[ug]id values
> > and their subtle behavior is not casually obvious. I'd recommend using
> > a userspace workaround for your use case, and not changing the legacy
> > behavior of the kernel.
>
> What userspace workaround would be possible? The only thing that comes
> to my mind is to apply NO_NEW_PRIVS in the child process after exec -
> but that's too late, because arbitrary untrusted code has already been
> executed at this point. This means I lose NO_NEW_PRIVS completely,
> because the attacker can simply skip the call.
>
> Max