[PATCH] RDMA/uverbs: Consider capability of the process that opens the file

Eric W. Biederman ebiederm at xmission.com
Wed Apr 30 03:34:41 UTC 2025


Parav Pandit <parav at nvidia.com> writes:

>> From: Eric W. Biederman <ebiederm at xmission.com>
>> Sent: Monday, April 28, 2025 10:34 PM
>
> [..]
>> > I said "user_ns of the netns"?  Credentials of the process is
>> > something else?
>> 
>> Exactly.  The credentials of a process are not:
>> 	current->nsproxy->net_ns->user_ns;  /* Not this */
>> 
>> The credentials of a process are:
>> 	current->cred;  /* This */
>> 
>> With current->cred->user_ns being the current process's user namespace.
>> 
> I am confused with your above response.
> In response [1], you described that net ns is the resource,
> hence resource's user namespace is considered.
> And your response [1] also aligns to existing code of [2] and many similar conversions done by your commit 276996fda0f33.
>
> [1] https://lore.kernel.org/linux-rdma/87ikmnd3j6.fsf@email.froward.int.ebiederm.org/T/#me5983d8248de0ff9670644c57d71009debaedd6f
> [2] https://elixir.bootlin.com/linux/v6.14.3/source/net/ipv4/af_inet.c#L314
>
> So in infiniband, when I replace existing capable() with ns_capable(), 
> shouldn't I use current->nsproxy->net_ns->user_ns following [1] and
> [2], because for infiniband too, the resource is net namespace.

Almost.

It is true that current->nsproxy->net_ns matches ib_device->net_ns at
open time, but those permission checks don't happen at open time.

After open time you want ib_device->net_ns.  Not
current->nsproxy->net_ns.

At which point your ns_capable call will look something like:

	ns_capable(ib_device->net_ns->user_ns, CAP_NET_RAW);

That ns_capable call will then check

ib_device->net_ns->user_ns against
current->cred->user_ns.

And it will verify that CAP_NET_RAW is in
current->cred->cap_effective.

Thus checking the resource (the ib_device) against the current
process's credentials.
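
To make the direction of that check concrete, here is a small userspace
model of it.  This is only a sketch: struct model_user_ns, struct
model_cred, MODEL_CAP_NET_RAW and model_ns_capable() are names made up
for illustration, and the walk is a simplified rendering of the
capability check (no LSM hooks, no uid mapping), not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct user_namespace and struct cred. */
struct model_user_ns {
	struct model_user_ns *parent;	/* NULL for the initial namespace */
	int level;			/* 0 for the initial namespace */
	unsigned int owner;		/* euid of the creator */
};

struct model_cred {
	struct model_user_ns *user_ns;	/* like current->cred->user_ns */
	unsigned int euid;
	uint64_t cap_effective;		/* like current->cred->cap_effective */
};

#define MODEL_CAP_NET_RAW 13	/* same number as CAP_NET_RAW */

/*
 * Simplified model of ns_capable(target, cap): start at the user
 * namespace that owns the resource and walk towards the root, looking
 * for the caller's user namespace.
 */
static bool model_ns_capable(const struct model_cred *cred,
			     const struct model_user_ns *target, int cap)
{
	const struct model_user_ns *ns = target;

	for (;;) {
		/* Same namespace: the caller's effective set decides. */
		if (ns == cred->user_ns)
			return (cred->cap_effective >> cap) & 1;
		/* Target is not below the caller's namespace: denied. */
		if (ns->level <= cred->user_ns->level)
			return false;
		/* The creator of a child namespace holds all caps in it. */
		if (ns->parent == cred->user_ns && ns->owner == cred->euid)
			return true;
		ns = ns->parent;
	}
}
```

In this model a process whose credentials live in a child user
namespace, even with every bit set in cap_effective, is denied when the
target is the initial user namespace.  Only the resource's namespace
decides which credentials count.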

----

The danger of using current->nsproxy->net_ns->user_ns after
open time is that the caller may have done:

unshare(CLONE_NEWUSER);
unshare(CLONE_NEWNET);

At which point
"ns_capable(current->nsproxy->net_ns->user_ns, CAP_NET_RAW)"
is guaranteed to be true.

But it isn't meaningful because there will be no ib_devices in that
network namespace.
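
The effect of those two unshare calls can be observed from userspace.
The sketch below forks a child, performs both calls, and reads CapEff
from /proc/self/status to see whether CAP_NET_RAW is set in the child's
own (new) user namespace.  The helper name cap_net_raw_after_unshare()
is made up for illustration, and the result depends on the system:
where unprivileged user namespaces are disabled or blocked by a sandbox
the helper returns -1 instead.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <linux/capability.h>	/* CAP_NET_RAW */

/*
 * Hypothetical helper: returns 1 if CAP_NET_RAW is in the effective set
 * after unshare(CLONE_NEWUSER) + unshare(CLONE_NEWNET), 0 if it is not,
 * and -1 if the experiment could not be run on this system.
 */
static int cap_net_raw_after_unshare(void)
{
	pid_t pid = fork();
	int status;

	if (pid < 0)
		return -1;

	if (pid == 0) {
		unsigned long long eff = 0;
		char line[256];
		int found = 0;
		FILE *f;

		/* Run in a child so the caller's namespaces are untouched. */
		if (unshare(CLONE_NEWUSER) != 0 || unshare(CLONE_NEWNET) != 0)
			_exit(2);

		f = fopen("/proc/self/status", "r");
		if (!f)
			_exit(2);
		while (fgets(line, sizeof(line), f)) {
			/* CapEff is a hex bitmask of the effective set. */
			if (sscanf(line, "CapEff: %llx", &eff) == 1) {
				found = 1;
				break;
			}
		}
		fclose(f);
		if (!found)
			_exit(2);
		_exit((int)((eff >> CAP_NET_RAW) & 1));
	}

	if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
		return -1;
	if (WEXITSTATUS(status) > 1)
		return -1;
	return WEXITSTATUS(status);
}
```

Where the unshare calls are permitted, the child should report the
capability as set, even though the parent may have been completely
unprivileged.  That capability only applies to the new user namespace
and the network namespace it owns.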

----

Because of the shared device mode, a relaxed permission check
would actually need to look more like:

	struct user_namespace *user_ns = shared ?
		&init_user_ns : ib_device->net_ns->user_ns;

	ns_capable(user_ns, CAP_NET_RAW);

This keeps a single ns_capable call for easier maintenance, while
relaxing the permission check only in the non-shared case.

Eric




