[PATCH] RDMA/uverbs: Consider capability of the process that opens the file
Eric W. Biederman
ebiederm at xmission.com
Wed Apr 30 03:34:41 UTC 2025
Parav Pandit <parav at nvidia.com> writes:
>> From: Eric W. Biederman <ebiederm at xmission.com>
>> Sent: Monday, April 28, 2025 10:34 PM
>
> [..]
>> > I said "user_ns of the netns"? Credentials of the process is
>> > something else?
>>
>> Exactly. The credentials of a process are not:
>> current->nsproxy->net_ns->user_ns; /* Not this */
>>
>> The credentials of a process are:
>> current->cred; /* This */
>>
>> With current->cred->user_ns being the current process's user namespace.
>>
> I am confused with your above response.
> In response [1], you described that net ns is the resource,
> hence resource's user namespace is considered.
> And your response [1] also aligns to existing code of [2] and many similar conversions done by your commit 276996fda0f33.
>
> [1] https://lore.kernel.org/linux-rdma/87ikmnd3j6.fsf@email.froward.int.ebiederm.org/T/#me5983d8248de0ff9670644c57d71009debaedd6f
> [2] https://elixir.bootlin.com/linux/v6.14.3/source/net/ipv4/af_inet.c#L314
>
> So in infiniband, when I replace existing capable() with ns_capable(),
> shouldn't I use current->nsproxy->net_ns->user_ns following [1] and
> [2], because for infiniband too, the resource is net namespace.
Almost.

It is true that current->nsproxy->net_ns matches ib_device->net_ns at
open time, but those permission checks don't happen at open time.

After open time you want ib_device->net_ns, not
current->nsproxy->net_ns.

At which point your ns_capable call will look something like:

    ns_capable(ib_device->net_ns->user_ns, CAP_NET_RAW);

That ns_capable call will then check ib_device->net_ns->user_ns
against current->cred->user_ns, and it will verify that CAP_NET_RAW
is in current->cred->cap_effective.

This checks the resource (the ib_device) against the current
process's credentials.

----

The danger of using current->nsproxy->net_ns->user_ns after
open time is that the caller may have done:

    unshare(CLONE_NEWUSER);
    unshare(CLONE_NEWNET);

At which point

    ns_capable(current->nsproxy->net_ns->user_ns, CAP_NET_RAW)

is guaranteed to be true.

But it isn't meaningful because there are no ib_devices in that
network namespace.

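To make the difference concrete, here is a small userspace model of the check. It is not kernel code: the structs, model_ns_capable() and check_after_unshare() are hypothetical names made up for illustration. The walk up the namespace hierarchy is a simplified rendering of what the kernel's capability check does, and the sketch assumes that unshare(CLONE_NEWUSER) leaves the caller with a full capability set inside the new user namespace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: names mimic the kernel, definitions do not. */
struct user_namespace {
    struct user_namespace *parent; /* NULL for the initial namespace */
    int level;                     /* 0 for the initial namespace */
    unsigned int owner;            /* euid of the creator */
};

struct net {
    struct user_namespace *user_ns; /* user namespace that owns this netns */
};

struct cred {
    struct user_namespace *user_ns;
    unsigned int euid;
    unsigned long cap_effective;    /* one bit per capability */
};

#define CAP_NET_RAW 13

/* Simplified model of checking a capability against a target namespace. */
bool model_ns_capable(const struct cred *cred,
                      const struct user_namespace *ns, int cap)
{
    assert(cred != NULL && ns != NULL);
    for (;;) {
        /* Same namespace: the effective set decides. */
        if (ns == cred->user_ns)
            return (cred->cap_effective >> cap) & 1UL;
        /* Target is not a descendant of the caller's namespace. */
        if (ns->level <= cred->user_ns->level)
            return false;
        /* The creator of a child namespace has all capabilities in it. */
        if (ns->parent == cred->user_ns && ns->owner == cred->euid)
            return true;
        ns = ns->parent;
    }
}

/*
 * Model of an unprivileged caller (euid 1000) after
 * unshare(CLONE_NEWUSER); unshare(CLONE_NEWNET);
 * The device stays in a netns owned by the initial user namespace.
 */
bool check_after_unshare(bool use_device_ns)
{
    struct user_namespace init_ns = { .parent = NULL, .level = 0, .owner = 0 };
    struct net device_net = { .user_ns = &init_ns };

    /* unshare(CLONE_NEWUSER): child namespace, caller gets all caps in it. */
    struct user_namespace child_ns = { .parent = &init_ns, .level = 1, .owner = 1000 };
    struct cred cred = { .user_ns = &child_ns, .euid = 1000, .cap_effective = ~0UL };

    /* unshare(CLONE_NEWNET): fresh netns owned by child_ns, no devices in it. */
    struct net current_net = { .user_ns = &child_ns };

    const struct net *checked = use_device_ns ? &device_net : &current_net;
    return model_ns_capable(&cred, checked->user_ns, CAP_NET_RAW);
}
```

In this model check_after_unshare(false), the check against the caller's own netns, succeeds, while check_after_unshare(true), the check against the device's netns, fails.
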
----

Because of the shared device stuff a relaxed permission check
would actually need to look more like:

    struct user_namespace *user_ns = shared ? &init_user_ns : ib_device->net_ns->user_ns;
    ns_capable(user_ns, CAP_NET_RAW);

This allows sharing the capable call for better maintenance while
relaxing the permission check only for the non-shared case.

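The namespace selection could be factored into a helper, roughly as below. This is a standalone sketch with stand-in types so it compiles outside the kernel; rdma_check_user_ns() is a hypothetical name, and the net_ns field on the device follows the notation used in this mail rather than the actual struct layout.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in types for illustration; not the kernel definitions. */
struct user_namespace { int level; };
struct net { struct user_namespace *user_ns; };
struct ib_device { struct net *net_ns; };

/* Stand-in for the kernel's initial user namespace. */
struct user_namespace init_user_ns = { .level = 0 };

/*
 * Pick the user namespace to check against:
 * shared mode keeps the strict check against the initial namespace,
 * otherwise the namespace owning the device's netns is used.
 */
struct user_namespace *rdma_check_user_ns(const struct ib_device *dev, bool shared)
{
    return shared ? &init_user_ns : dev->net_ns->user_ns;
}
```

The caller would then pass the result to a single ns_capable() call, so both modes share one check site.
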
Eric