[PATCH] RDMA/uverbs: Consider capability of the process that opens the file

Parav Pandit parav at nvidia.com
Sun Apr 20 17:31:34 UTC 2025



> From: Serge E. Hallyn <serge at hallyn.com>
> Sent: Sunday, April 20, 2025 7:12 PM
> 
> On Sun, Apr 20, 2025 at 12:30:37PM +0000, Parav Pandit wrote:
> >
> >
> > > From: sergeh at kernel.org <sergeh at kernel.org>
> > > Sent: Monday, April 7, 2025 8:17 PM
> > >
> > > On Mon, Apr 07, 2025 at 11:16:35AM +0000, Parav Pandit wrote:
> > > > > From: Serge E. Hallyn <serge at hallyn.com>
> > > > > Sent: Sunday, April 6, 2025 7:45 PM
> > > > >
> > > > > On Fri, Apr 04, 2025 at 12:13:47PM -0300, Jason Gunthorpe wrote:
> > > > > > On Fri, Apr 04, 2025 at 02:53:30PM +0000, Parav Pandit wrote:
> > > > > > > To summarize,
> > > > > > >
> > > > > > > 1. A process can open an RDMA resource (such as a raw QP,
> > > > > > > raw flow entry, or similar 'raw' resource) through the fd
> > > > > > > > using ioctl(), if it has the appropriate capability, which
> > > > > > > > in this case is CAP_NET_RAW.
> > > > > > > This is similar to a process that opens a raw socket.
> > > > > > >
> > > > > > > 2. Given that RDMA uses ioctl() for resource creation, there
> > > > > > > isn't a security concern surrounding the read()/write() system calls.
> > > > > > >
> > > > > > > 3. If process A, which does not have CAP_NET_RAW, passes the
> > > > > > > opened fd to another privileged process B, which has
> > > > > > > > CAP_NET_RAW, process B can open the raw RDMA resource.
> > > > > > > > This is still within the kernel-defined security boundary,
> > > > > > > > similar to a raw socket.
> > > > > > >
> > > > > > > 4. If process A, which has the CAP_NET_RAW capability,
> > > > > > > > passes the file descriptor to Process B, which does not
> > > > > > > > have CAP_NET_RAW, Process B will not be able to open the
> > > > > > > > raw RDMA resource.
> > > > > > >
> > > > > > > Do we agree on this Eric?
> > > > > >
> > > > > > > This is our model, I consider it uAPI, so I don't believe we
> > > > > > can change it without an extreme reason..
> > > > > >
> > > > > > > 5. the process's capability check should be done in the
> > > > > > > > right user namespace.
> > > > > > > > (instead of current in default user ns).
> > > > > > > > The right user namespace is the one which created the net
> > > > > > > > namespace.
> > > > > > > > This is because rdma networking resources are governed by
> > > > > > > > the net namespace.
> > > > > >
> > > > > > This all makes my head hurt. The right user namespace is the
> > > > > > one that is currently active for the invoking process, I
> > > > > > couldn't understand why we have net namespaces refer to user
> > > > > > namespaces :\
> > > > >
> > > > > A user at any time can create a new user namespace, without
> > > > > creating a new network namespace, and have privilege in that
> > > > > user namespace, over resources owned by the user namespace.
> > > > >
> > > >
> > > > > So if a user can create a new user namespace, then say "hey I
> > > > > have CAP_NET_ADMIN over current_user_ns, so give me access to
> > > > > > the RDMA resources belonging to my current_net_ns", that's a
> > > > > > problem.
> > > > >
> > > > > > So that's why the check should be
> > > > > > ns_capable(device->net->user_ns, CAP_NET_ADMIN) and not
> > > > > > ns_capable(current_user_ns, CAP_NET_ADMIN).
> > > > >
> > > > Given the check is of the process (and hence user and net ns) and
> > > > not of the rdma device itself, Shouldn't we just check,
> > > >
> > > > ns_capable(current->nsproxy->user_ns, ...)
> > > >
> > > > This ensures current network namespace's owning user ns is consulted.
> > >
> > > No, it does not.  If I do
> > >
> > > unshare -U
> > >
> > > then current->nsproxy->user_ns is not my current network namespace's
> > > owning user ns.
> > >
> > It should be current->nsproxy->net->user_ns.
> > This ensures that it is always current network namespace's owning user ns is
> > considered.
> > Right?
> >
> 
> Hi,
> 
> That will depend on exactly what you're checking permissions for.
> It looks like ib_uverbs_ex_create_flow() gets passed a uverbs_attr_bundle
> pointer that has a context which holds the thing you're actually checking
> permissions towards?  And I'm assuming that that thing is actually a file?  

The file is just a conduit for calling into the kernel for a specific device.
The ucontext is the first object, and other objects, such as a flow created through ex_create_flow(), are anchored to it.
So what we really want to check on the ioctl() is whether the calling process has the necessary capability in its user namespace.

> So
> again, if the task can create the "thing" first, then unshare its network
> namespace, then cause this permission to be checked, or if it can accept a file
> over unix socket or whatever that someone else opened, then
> current->nsproxy->net->user_ns may *not* be relevant.
As Jason explained, the process can receive the fd, or it can unshare as well.
As long as the process invoking ioctl() has the capability, it should be able to perform the operation.
Jason explained the RDMA fd model in [1].

[1] https://lore.kernel.org/linux-rdma/20250420134144.GA575032@mail.hallyn.com/T/#m1c84babbc593b255c6a4fdebb6c65651717a75f7

> If, however, the flow, later
> on, will ensure that any actions are only relevant in the current network
> namespace, then you are correct.
> 
Indeed. The flow and its operations are in the net namespace of the process.
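
To make the "unshare -U" case raised earlier in the thread concrete, here is a
rough userspace sketch of the two candidate checks. It is not kernel code: the
model_* types, model_ns_capable() and the two check_* helpers are hypothetical
names, and the walk is only loosely modeled on how the kernel resolves a
capability against a target user namespace. It assumes an unprivileged uid 1000
that creates a new user namespace and stays in the initial net namespace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace model; none of these types are the kernel's. */
struct model_user_ns {
    const struct model_user_ns *parent; /* NULL for the initial user ns */
    int owner_uid;                      /* uid that created this user ns */
    int level;                          /* nesting depth, 0 for the initial ns */
};

struct model_net {
    const struct model_user_ns *user_ns; /* user ns that owns this net ns */
};

struct model_task {
    const struct model_user_ns *user_ns; /* user ns of the task's credentials */
    const struct model_net *net;         /* net ns the task currently runs in */
    int euid;
    bool cap_net_raw;                    /* effective in the task's own user ns */
};

/* Does task t hold CAP_NET_RAW with respect to the target user ns? */
static bool model_ns_capable(const struct model_task *t,
                             const struct model_user_ns *target)
{
    assert(t != NULL && t->user_ns != NULL);
    for (const struct model_user_ns *ns = target; ns != NULL; ns = ns->parent) {
        if (ns == t->user_ns)
            return t->cap_net_raw;       /* same ns: use the task's own caps */
        if (ns->level <= t->user_ns->level)
            return false;                /* target is not below the task's ns */
        if (ns->parent == t->user_ns && ns->owner_uid == t->euid)
            return true;                 /* creator owns its child namespaces */
    }
    return false;
}

/* Assumed scenario: uid 1000 runs "unshare -U" and keeps the initial net ns. */
static const struct model_user_ns init_user_ns = { NULL, 0, 0 };
static const struct model_net init_net = { &init_user_ns };
static const struct model_user_ns child_user_ns = { &init_user_ns, 1000, 1 };
static const struct model_task unshared_task = { &child_user_ns, &init_net, 1000, true };
static const struct model_task root_task = { &init_user_ns, &init_net, 0, true };

/* Candidate 1: check against the task's own (current) user ns. */
static bool check_against_own_user_ns(const struct model_task *t)
{
    return model_ns_capable(t, t->user_ns);
}

/* Candidate 2: check against the user ns that owns the task's net ns. */
static bool check_against_net_owner(const struct model_task *t)
{
    return model_ns_capable(t, t->net->user_ns);
}
```

In this model, check_against_own_user_ns(&unshared_task) passes because the task
has full capabilities in the user namespace it just created, while
check_against_net_owner(&unshared_task) fails because the initial net namespace
is owned by the initial user namespace. For root_task both checks pass.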

> I just can't tell in this flow.  I'll try to find some time to track it down more.
> 
> -serge


