[PATCH] RDMA/uverbs: Consider capability of the process that opens the file

Jason Gunthorpe jgg at nvidia.com
Tue Apr 22 16:11:27 UTC 2025


On Tue, Apr 22, 2025 at 08:14:33AM -0500, Serge E. Hallyn wrote:
> Hi Jason,
> 
> On Tue, Apr 22, 2025 at 09:46:40AM -0300, Jason Gunthorpe wrote:
> > On Mon, Apr 21, 2025 at 12:22:36PM -0500, Serge E. Hallyn wrote:
> > > > > 1. the create should check ns_capable(current->nsproxy->net->user_ns,
> > > > > CAP_NET_RAW) 
> > > > I believe this is sufficient as this create call happens through the ioctl().
> > > > But more question on #3.
> > 
> > I think this is the right one to use everywhere.
> 
> It's the right one to use when creating resources, but when later using
> them, since below you say that the resource should in fact be tied to
> the creator's network namespace, that means that checking
> current->nsproxy->net->user_ns would have nothing to do with the
> resource being used, right?

Yes, in that case you'd check something stored in the uobject.

This happens sort of indirectly. For instance, an object may become
associated with a netdevice, and the netdevice is linked to a net
namespace. E.g. we should do route lookups relative to that associated
net device's namespace.

I'm not sure we have a capable()-style check like that, though.

> > Even in goofy cases like passing a FD between processes with different
> > net namespaces, the expectation is that objects can be created
> > relative to net namespace of the process calling the ioctl, and then
> > accessed by the other process in the other namespace.
> 
> So when earlier it was said that uverbs was switching from read/write
> to ioctl so that permissions could be checked, that is not actually
> the case? 

I don't quite know what you mean here?

read/write has a security problem in that you can pass a FD to a
setuid program as its stdout and have that setuid program issue a
write() to trigger a kernel operation using its elevated
privilege. This is not possible with ioctl.

When this bug was discovered the read/write path started calling
ib_safe_file_access() which blanket disallows *any* credential change
from open() to write().
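
As a rough userspace model of that rule (not the kernel code; the
rw_* names are made up for illustration, and a pointer comparison
stands in for comparing the opener's credentials with the caller's):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task's credentials. */
struct rw_cred {
    int euid;
};

/* Hypothetical stand-in for an open file: remembers who opened it. */
struct rw_file {
    const struct rw_cred *opener_cred;
};

/* open(): record the credentials in force at open time. */
static void rw_open(struct rw_file *f, const struct rw_cred *cur)
{
    assert(f != NULL && cur != NULL);
    f->opener_cred = cur;
}

/*
 * write(): allowed only if the caller's credentials are the very same
 * ones that opened the file. Any change at all, including a setuid
 * exec that raises privilege, fails the check.
 */
static bool rw_safe_file_access(const struct rw_file *f,
                                const struct rw_cred *cur)
{
    assert(f != NULL && cur != NULL);
    return f->opener_cred == cur;
}
```

In this model a FD opened by an unprivileged user and handed to a
setuid program is refused on write(), but so is every other
cross-credential use, which is the excessive part.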

ioctl removes this excessive restriction and we are back to
per-process checks.

> The intent is for a privileged task to create the
> resource and be able to pass it to any task in any namespace with any
> or no privilege and have that task be able to use it with the
> opener's original privilege, just as with read/write?

Yes. The permissions are associated with the object contained inside
the FD, not the FD itself. The FD is just a container and a way to
route system calls.

> I was trying last night to track down where the uverb ioctls are doing 
> permission checks, but failing to find it.  I see where the
> pbundle->method_elm->handler gets dereferenced, but not where those
> are defined.

There are very few permission checks. Most boil down to implicit
things: for example, we have a netdevice relative to current's net
namespace and we need to find a gid table index for that netdevice. We
don't actually need to do anything special here, as the ifindex code
automatically validates the namespace and struct net_device pointers
are globally unique.

Similarly with route lookups and the like: once we have validated the
net device, objects are supposed to remain bound to it.

The cases like CAP_NET_RAW are one-time checks at creation time that
modify the device's rules for processing the queues. The device checks
the creation property of the queue when processing the queue.
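
A minimal userspace sketch of that pattern, assuming a made-up queue
object with a single flag (the model_* names are hypothetical and are
not uverbs or driver APIs):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the caller's credentials at ioctl time. */
struct model_cred {
    bool cap_net_raw;
};

/* Hypothetical queue object; the flag is fixed at creation. */
struct model_queue {
    bool raw_allowed;
};

/* Creation: the only place the caller's capability is consulted. */
static int model_queue_create(const struct model_cred *creator,
                              bool want_raw, struct model_queue *q)
{
    assert(creator != NULL && q != NULL);
    if (want_raw && !creator->cap_net_raw)
        return -1; /* a real kernel path would return -EPERM */
    q->raw_allowed = want_raw;
    return 0;
}

/*
 * Processing: only the property stored on the queue is consulted.
 * Whoever currently holds the FD is not looked at.
 */
static bool model_queue_accepts(const struct model_queue *q,
                                bool is_raw_packet)
{
    assert(q != NULL);
    return !is_raw_packet || q->raw_allowed;
}
```

Note that model_queue_accepts() takes no credentials at all, so a
queue created by a privileged task behaves the same after the FD has
been passed to an unprivileged one.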

Jason


