[PATCH] RDMA/uverbs: Consider capability of the process that opens the file

Parav Pandit parav at nvidia.com
Wed Apr 23 15:56:39 UTC 2025


> From: Eric W. Biederman <ebiederm at xmission.com>
> Sent: Wednesday, April 23, 2025 9:14 PM
> 
> Jason Gunthorpe <jgg at nvidia.com> writes:
> 
> > On Wed, Apr 23, 2025 at 12:41:26PM +0000, Parav Pandit wrote:
> >>
> >> > From: Serge E. Hallyn <serge at hallyn.com>
> >> > Sent: Tuesday, April 22, 2025 10:00 PM
> >> >
> >> > On Tue, Apr 22, 2025 at 01:11:27PM -0300, Jason Gunthorpe wrote:
> >> > > On Tue, Apr 22, 2025 at 08:14:33AM -0500, Serge E. Hallyn wrote:
> >> > > > Hi Jason,
> >> > > >
> >> > > > On Tue, Apr 22, 2025 at 09:46:40AM -0300, Jason Gunthorpe wrote:
> >> > > > > On Mon, Apr 21, 2025 at 12:22:36PM -0500, Serge E. Hallyn wrote:
> >> > > > > > > > 1. the create should check
> >> > > > > > > > ns_capable(current->nsproxy->net->user_ns,
> >> > > > > > > > CAP_NET_RAW)
> >> > > > > > > I believe this is sufficient as this create call happens
> >> > > > > > > through the ioctl().
> >> > > > > > > But more question on #3.
> >> > > > >
> >> > > > > I think this is the right one to use everywhere.
> >> > > >
> >> > > > It's the right one to use when creating resources, but when
> >> > > > later using them, since below you say that the resource should
> >> > > > in fact be tied to the creator's network namespace, that means
> >> > > > that checking current->nsproxy->net->user_ns would have nothing
> >> > > > to do with the resource being used, right?
> >> > >
> >> > > Yes, in that case you'd check something stored in the uobject.
> >> >
> >> > Perfect, that's exactly the kind of thing I was looking for.  Thanks.
> >> >
> >> It means the uobject create path will refcount and store user_ns,
> >>
> >> uobject->user_ns = get_user_ns(current->nsproxy->net->user_ns);
> >>
> >> And uobject destroy will do,
> >> 	put_user_ns(uobject->user_ns).
> >>
> >> This will ensure that in below flow we won't have use_after_free.
> >> 1. process_A created object in user_ns_A
> >> 2. process_A shared fd with process_B in user_ns_B
> >> 3. process_A is killed
> >> 4. free of user_ns_A is attempted (free is skipped until the uobject is
> >>    destroyed by process_B).
> >
> > We only need to do that if something is legitimately doing capable
> > from a uobject outside of creation? Did you find that?
> 
> I believe the proposed change that started this discussion, was to make rdma
> usable inside of a user namespace.
> 
Answering both your question and Jason's question above:
Only the specific uobjects that require a CAP_X check will hold a user_ns reference.
The rest continue as is.
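
To make the lifetime argument concrete, here is a small userspace model of the flow quoted above. It is only a sketch: the sim_* types and helpers are hypothetical stand-ins for struct user_namespace, get_user_ns() and put_user_ns(), not kernel code, and "freed" is recorded in a flag rather than by actually freeing memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct user_namespace; only the refcount is modelled. */
struct sim_user_ns {
	int refcount;
	bool freed;	/* set when the last reference is dropped */
};

/* Hypothetical stand-in for a uobject. */
struct sim_uobject {
	struct sim_user_ns *user_ns;	/* NULL if the object needs no CAP_X check */
};

/* Models get_user_ns(): take a reference and return the namespace. */
static struct sim_user_ns *sim_get_user_ns(struct sim_user_ns *ns)
{
	ns->refcount++;
	return ns;
}

/* Models put_user_ns(): drop a reference, "free" on the last one. */
static void sim_put_user_ns(struct sim_user_ns *ns)
{
	if (--ns->refcount == 0)
		ns->freed = true;
}

/* Create path: only objects that need a capability check pin the user_ns. */
static void sim_uobject_create(struct sim_uobject *uobj,
			       struct sim_user_ns *creator_ns, bool needs_cap)
{
	uobj->user_ns = needs_cap ? sim_get_user_ns(creator_ns) : NULL;
}

/* Destroy path: release the reference taken at create time, if any. */
static void sim_uobject_destroy(struct sim_uobject *uobj)
{
	if (uobj->user_ns) {
		sim_put_user_ns(uobj->user_ns);
		uobj->user_ns = NULL;
	}
}

/* Walks steps 1-4 of the flow; returns 0 if the namespace outlives process_A. */
static int sim_shared_fd_flow(void)
{
	struct sim_user_ns ns_a = { .refcount = 1, .freed = false }; /* ref held by process_A */
	struct sim_uobject uobj;

	sim_uobject_create(&uobj, &ns_a, true);	/* 1. process_A creates the object */
	/* 2. fd is shared with process_B; this model takes no extra reference here */
	sim_put_user_ns(&ns_a);			/* 3. process_A is killed */
	if (ns_a.freed)				/* 4. free must be skipped for now */
		return -1;
	sim_uobject_destroy(&uobj);		/* process_B destroys the uobject */
	return ns_a.freed ? 0 : -1;
}
```

In this model, a uobject created with needs_cap == false never touches the namespace refcount, which corresponds to "the rest continue as is".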

> Which led to the question: Are the current capable calls safe and correct, as
> they aren't preserving the context that came with opening a file descriptor?  If I
> have skimmed this thread correctly, the answer is that not preserving the opener's
> context is a seriously atypical but deliberate choice.
> 
> > And I wonder if using the uobjects affiliated netdev's namespace is
> > OK?
>
We don't refer to the netdev of the rdma device, because a netdev is not present in many cases.
It is just an rdma device.

We always refer to the net ns of the currently running process.
A process is able to access the rdma device because the process and the rdma device are in the same net ns.


> That is actually preferable.  It is what I updated the rest of the network stack
> to do.  I don't know if you would use dev_net or something else.
>
current->nsproxy->net->user_ns should be sufficient to refer to, as both the device and the process are in the same net ns.
Moreover, it is the capability of the accessing process that is checked, not really anything of the rdma device.
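
As a rough illustration of what checking against current->nsproxy->net->user_ns means, below is a simplified userspace model. It paraphrases the user namespace walk as I understand it from the kernel's capability code; all sim_* names are hypothetical, and the real ns_capable() path also involves LSM hooks that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SIM_CAP_NET_RAW 13	/* same value as CAP_NET_RAW in linux/capability.h */

/* Hypothetical stand-ins for the kernel structures involved. */
struct sim_userns {
	struct sim_userns *parent;	/* NULL for the initial namespace */
	int level;			/* 0 for the initial namespace */
	unsigned int owner_uid;		/* uid that created this namespace */
};

struct sim_net {
	struct sim_userns *user_ns;	/* user namespace that owns this net ns */
};

struct sim_task {
	struct sim_userns *cred_user_ns;	/* user namespace of the task's creds */
	unsigned int euid;
	unsigned long cap_effective;		/* bit n set means capability n is held */
	struct sim_net *net;			/* models current->nsproxy->net */
};

/* Simplified model of a namespaced capability check against target. */
static bool sim_ns_capable(const struct sim_task *t,
			   const struct sim_userns *target, int cap)
{
	const struct sim_userns *ns;

	for (ns = target; ns; ns = ns->parent) {
		/* Same namespace as the creds: the effective set decides. */
		if (ns == t->cred_user_ns)
			return (t->cap_effective >> cap) & 1UL;
		/* Target is not below the task's namespace: no privilege. */
		if (ns->level <= t->cred_user_ns->level)
			return false;
		/* The owner of a child namespace has all capabilities in it. */
		if (ns->parent == t->cred_user_ns && ns->owner_uid == t->euid)
			return true;
	}
	return false;
}

/* Create-time check: CAP_NET_RAW in the user_ns owning the caller's net ns. */
static bool sim_create_allowed(const struct sim_task *t)
{
	return sim_ns_capable(t, t->net->user_ns, SIM_CAP_NET_RAW);
}
```

In this model, a task holding CAP_NET_RAW only inside a child user namespace passes the check when its net ns is owned by that child namespace, and fails it when its net ns is owned by the initial user namespace.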
 
> Going back to the original proposal I don't know how ready the code is to
> handle callers that are not root.  This is both a question of semantics (is it safe
> in theory) and a question of implementation (are there unfixed bugs that no
> one cares about because only root has been using the code).
> 
> Eric


