[PATCH] RDMA/uverbs: Consider capability of the process that opens the file

Serge E. Hallyn serge at hallyn.com
Mon Apr 21 17:22:36 UTC 2025


On Mon, Apr 21, 2025 at 01:33:45PM +0000, Parav Pandit wrote:
> 
> 
> > From: Serge E. Hallyn <serge at hallyn.com>
> > Sent: Monday, April 21, 2025 6:30 PM
> > 
> > On Mon, Apr 21, 2025 at 11:04:57AM +0000, Parav Pandit wrote:
> > >
> > > > From: Serge E. Hallyn <serge at hallyn.com>
> > > > Sent: Monday, April 21, 2025 8:43 AM
> > > >
> > > > On Fri, Apr 04, 2025 at 02:53:30PM +0000, Parav Pandit wrote:
> > > > > Hi Eric, Jason,
> > > >
> > > > Hi,
> > > >
> > > > I'm jumping back up the thread as I think this email best details
> > > > the things I'm confused about :)  Three questions below in two
> > > > different stanzas.
> > > >
> > > > > To summarize,
> > > > >
> > > > > 1. A process can open an RDMA resource (such as a raw QP, raw flow
> > > > > entry, or similar 'raw' resource) through the fd using ioctl(), if
> > > > > it has the appropriate capability, which in this case is CAP_NET_RAW.
> > > >
> > > > Why does it need CAP_NET_RAW to create the resource, if the resource
> > > > won't be usable by a process without CAP_NET_RAW later anyway?
> > > Once the resource is created, and the fd is shared (like a raw socket fd),
> > > it will be usable by a process without CAP_NET_RAW.
> > > Is that a concern? If yes, how is it solved for raw socket fd? It appears
> > > to me that it is not.
> > >
> > > > Is that legacy
> > > > for the read/write (vs ioctl) case?
> > > No.
> > >
> > > > Or is it to limit the number of opened resources?  Or some other
> > > > reason?
> > > >
> > > The resource enables raw operations, hence the check that the process
> > > has the CAP_NET_RAW capability.
> > 
> > Ok, so it seems to me that
> > 
> > 1. the create should check ns_capable(current->nsproxy->net_ns->user_ns,
> > CAP_NET_RAW)
> I believe this is sufficient, as this create call happens through the ioctl().
> But I have more questions on #3.
> 
> > 2. the read/write are a known escape, eventually to be
> > removed?
> Write should be deprecated eventually.
> Jason mentioned that write() can be compiled out of the kernel.
> I guess it needs a new compile-time config flag around [1].
> 
> [1] https://elixir.bootlin.com/linux/v6.14.3/source/drivers/infiniband/core/uverbs_main.c#L1037
> 
> > 3. the ioctl should check file_ns_capable(attrs->ufile->filp->f_cred->user_ns,
> > CAP_NET_RAW)
> > 
> > Two notes about (3).  First, note that it's different from what you had.
> > It explicitly checks that the caller has CAP_NET_RAW against the net
> > namespace that was used to open the file.  
> How is the net namespace linked in #3?
> Is it because when the file was opened, the rdma device was accessible in a
> given net ns?
> But again, the net ns is not explicitly accessed in #3.

I'll have to look around and see if we can deduce the netns from elsewhere,
the device perhaps.  But IIUC the file's user_ns should be the one for
which we checked that it has CAP_NET_RAW over the actual net->user_ns,
so if you have CAP_NET_RAW in that user_ns, then you're good.  Where it
*could* get wonky is if the opener was in a parent user_ns of the
net->user_ns.  In that case the file's user_ns will be sufficient to access
the net, but we could end up denying access from a privileged process in a
child user_ns, which is potentially the net->user_ns itself.
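
To make that case concrete, here is a rough userspace model.  It is not
kernel code: the toy_* structs and helpers are made up for illustration,
only CAP_NET_RAW is tracked, and the walk is a simplified version of how
I understand ns_capable() to resolve a capability against a target
user_ns.  It also assumes that check (3) compares the caller against the
opener's user_ns, which is my reading of the proposal above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel objects; these are NOT the kernel's structs. */
struct toy_userns {
	const struct toy_userns *parent; /* NULL for the initial user_ns */
	int level;                       /* 0 for the initial user_ns */
	unsigned int owner_uid;          /* uid that created this user_ns */
};

struct toy_cred {
	const struct toy_userns *user_ns; /* user_ns the creds live in */
	unsigned int euid;
	bool cap_net_raw;                 /* CAP_NET_RAW in effective set */
};

/* Simplified capability walk: does cred hold CAP_NET_RAW over target? */
static bool toy_ns_capable_net_raw(const struct toy_cred *cred,
				   const struct toy_userns *target)
{
	const struct toy_userns *ns;

	assert(cred && cred->user_ns && target);
	for (ns = target; ns; ns = ns->parent) {
		/* Same user_ns: the effective capability set decides. */
		if (ns == cred->user_ns)
			return cred->cap_net_raw;
		/* Target is not a descendant of cred's user_ns: deny. */
		if (ns->level <= cred->user_ns->level)
			return false;
		/* Creator of a direct child user_ns holds all caps in it. */
		if (ns->parent == cred->user_ns &&
		    ns->owner_uid == cred->euid)
			return true;
	}
	return false;
}

/* Check (1): caller against the user_ns that owns the net namespace. */
static bool toy_check_net_userns(const struct toy_cred *caller,
				 const struct toy_userns *net_userns)
{
	return toy_ns_capable_net_raw(caller, net_userns);
}

/* Check (3), as read here: caller against the opener's user_ns. */
static bool toy_check_file_userns(const struct toy_cred *caller,
				  const struct toy_cred *file_cred)
{
	return toy_ns_capable_net_raw(caller, file_cred->user_ns);
}
```

With an opener in the parent user_ns and a caller holding CAP_NET_RAW in
the child user_ns that owns the net namespace, toy_check_net_userns()
passes for the caller while toy_check_file_userns() fails, which is the
denial described above.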

> > Second, I'm suggesting this
> > because Jason does keep saying that ioctl is supposed to solve the missing
> > permission check.  
> I don't understand how ioctl() is a replacement for the ns_capable()
> capability check.

I'm assuming the ioctl system call handler does the check.  I'll double-check.

> Do you mean to delete the capable() call itself?
> I likely misunderstood.
> 
> > If it really is felt that no permission check should be
> > needed, that's a different discussion.  I've just been trying to figure out where
> > the state should be tracked.
> > 
> > -serge


