[GIT PULL] Block fixes for 6.18-rc3

Fri Oct 24 20:31:11 UTC 2025

[ Adding LSM people. Also Christian, because he did the cred refcount
cleanup with override_creds() and friends last year, and I'm
suggesting taking that one step further ]

On Fri, 24 Oct 2025 at 06:58, Jens Axboe <axboe at kernel.dk> wrote:
>
> Ondrej Mosnacek (1):
>       nbd: override creds to kernel when calling sock_{send,recv}msg()

I've pulled this, but looking at the patch, I note that more than half
the patch - 75% to be exact - is just boilerplate for "I need to
allocate the kernel cred and deal with error handling there".

It literally has three lines of new actual useful code (two statements
and one local variable declaration), and then nine lines of the "setup
dance".

Which isn't wrong, but when the infrastructure boilerplate is three
times more than the actual code, it makes me think we should maybe
just get rid of the

    my_kernel_cred = prepare_kernel_cred(&init_task);

pattern for this use-case, and just let people use "init_cred"
directly for things like this.

Because that's essentially what that prepare_kernel_cred() thing
returns, except it allocates a new copy of said thing, so now you have
error handling and you have to free it after-the-fact.

And I'm not seeing that the extra error handling and freeing dance
actually buys us anything at all.

Now, some *other* users actually go on to change the creds: they want
that prepare_kernel_cred() dance because they then actually do
something else like using their own keyring or whatever (eg the NFS
idmap code or some other filesystem stuff).

So it's not like prepare_kernel_cred() is wrong, but in this kind of
case where people just go "I'm a driver with hardware access, I want
to do something with kernel privileges not user privileges", it
actually seems counterproductive to have extra code just to complicate
things.

Now, my gut feel is that if we just let people use 'init_cred'
directly, we should also make sure that it's always exposed as a
'const struct cred', but wouldn't that be a whole lot simpler and
more straightforward?

This is *not* the only use case of that.

We now have at least four use-cases of this "raw kernel cred" pattern:
core-dumping over unix domain socket, nbd, firmware loading and SCSI
target all do this exact thing as far as I can tell.

So they all just want that bare kernel cred, and this interface then
forces them to do extra work instead of just doing

        old_cred = override_creds(&init_cred);
        ...
        revert_creds(old_cred);

and it ends up being extra code for allocating and freeing that copy
of a cred that we already *had* and could just have used directly.
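
As a rough illustration of the difference, here is a userspace mock of
the two patterns. It is a sketch under loud assumptions, not kernel
code: 'struct cred' is reduced to a single uid field, override_creds(),
revert_creds() and put_cred() are simplified stand-ins that only borrow
the kernel names, and prepare_kernel_cred_mock(), do_io(), user_cred
and current_cred_ptr are hypothetical names invented for the example.
It also allocates per call to keep things short, which a real driver
would not necessarily do.

```c
/* Userspace mock, NOT kernel code: these types and helpers only imitate
 * the shape of the kernel cred API so the two patterns can be compared. */
#include <errno.h>
#include <stdlib.h>

struct cred { unsigned int uid; };	/* assumption: real struct cred is far larger */

static const struct cred init_cred = { .uid = 0 };	/* "kernel" identity */
static const struct cred user_cred = { .uid = 1000 };	/* hypothetical caller */
static const struct cred *current_cred_ptr = &user_cred;

/* Mock: install new creds and hand back the previous ones. */
static const struct cred *override_creds(const struct cred *new)
{
	const struct cred *old = current_cred_ptr;

	current_cred_ptr = new;
	return old;
}

/* Mock: restore the creds saved by override_creds(). */
static void revert_creds(const struct cred *old)
{
	current_cred_ptr = old;
}

/* Mock of the allocating helper: a fresh copy of init_cred, or NULL. */
static struct cred *prepare_kernel_cred_mock(void)
{
	struct cred *new = malloc(sizeof(*new));

	if (new)
		*new = init_cred;
	return new;
}

/* Mock: release a copy obtained from prepare_kernel_cred_mock(). */
static void put_cred(const struct cred *cred)
{
	free((void *)cred);
}

/* Hypothetical stand-in for the real work; reports the uid it ran as. */
static unsigned int do_io(void)
{
	return current_cred_ptr->uid;
}

/* Pattern 1: allocate a copy, handle failure, free it afterwards. */
static int io_with_allocated_cred(unsigned int *uid_seen)
{
	struct cred *kcred = prepare_kernel_cred_mock();
	const struct cred *old;

	if (!kcred)
		return -ENOMEM;
	old = override_creds(kcred);
	*uid_seen = do_io();
	revert_creds(old);
	put_cred(kcred);
	return 0;
}

/* Pattern 2: borrow init_cred directly; nothing to allocate or free. */
static int io_with_init_cred(unsigned int *uid_seen)
{
	const struct cred *old = override_creds(&init_cred);

	*uid_seen = do_io();
	revert_creds(old);
	return 0;
}
```

In this mock both functions run do_io() with uid 0 and restore the
caller's creds afterwards; the only difference is that the second one
has no allocation, no error path and no free. Since the mock
override_creds() takes a 'const struct cred *', passing the address of
a const init_cred needs no cast.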

I did just check that making 'init_cred' be const

  --- a/include/linux/init_task.h
  +++ b/include/linux/init_task.h
  @@ -28 +28 @@ extern struct nsproxy init_nsproxy;
  -extern struct cred init_cred;
  +extern const struct cred init_cred;
  --- a/kernel/cred.c
  +++ b/kernel/cred.c
  @@ -44 +44 @@ static struct group_info init_groups = { .usage = REFCOUNT_INIT(2) };
  -struct cred init_cred = {
  +const struct cred init_cred = {

seems to build just fine and would seem to be the right thing to do
even if we *don't* expect people to use it. And override_creds() is
perfectly happy with a 'const struct cred *' argument.

Maybe there's some reason for that extra work that I'm not seeing and
thinking of? But it all smells like make-believe work to me that
probably has a historical reason for it, but doesn't seem to make a
lot of sense any more.

Hmm?

               Linus