[PATCH] RDMA/uverbs: Fix CAP_NET_RAW check for flow create in user namespace
Parav Pandit
parav at nvidia.com
Mon Mar 10 14:47:53 UTC 2025
Hi,
> From: Serge E. Hallyn <serge at hallyn.com>
> Sent: Monday, March 10, 2025 7:01 PM
>
> On Sat, Mar 08, 2025 at 08:06:02PM +0200, Parav Pandit wrote:
> > A process running in a non-init user namespace possesses the
> > CAP_NET_RAW capability. However, the patch cited in the fixes tag
> > checks the capability in the default init user namespace.
> > Because of this, when the process was started by Podman in a
> > non-default user namespace, the flow creation failed.
> >
> > Fix this issue by checking the CAP_NET_RAW networking capability in
> > the owner user namespace that created the network namespace.
>
> Hi,
>
> you say
>
> > Fix this issue by checking the CAP_NET_RAW networking capability > in the
> owner user namespace that created the network namespace.
>
> But in fact you are checking the CAP_NET_RAW against the user's network
> namespace.
I didn't understand your comment.
The fix takes the current process's network namespace by referring to current->nsproxy->net_ns.
Each net ns has its owning user namespace who has created it.
So the patch is checking caps in the such user namespace.
Can you please elaborate?
> That is usually not the same thing, although it is possible that in
> this case it is.
>
> What is cmd.flow_id? Is that guaranteed to represent something in the
> current process' network namespace?
The flow_id is namespaced in the rdma device. A network namespace can have multiple devices.
So it cannot be unique because an rdma device in different namespace may have same flow_id.
> Or is it possible that a user without
> privilege in his user namespace could unshare userns+netns but then cause
> this fn to be called against a flow in the original network namespace?
>
It can do unshare user_ns+netns, but
I fail to see it can refer to the original (previous net_ns) in this call, because this call is always refers to current->nsproxy->net_us, which means it operates on the latest net ns (after unshare).
> >
> > This change is similar to the following cited patches.
> >
> > commit 5e1fccc0bfac ("net: Allow userns root control of the core of
> > the network stack.") commit 52e804c6dfaa ("net: Allow userns root to
> > control ipv4") commit 59cd7377660a ("net: openvswitch: allow conntrack
> > in non-initial user namespace") commit 0a3deb11858a ("fs: Allow
> > listmount() in foreign mount namespace") commit dd7cb142f467 ("fs:
> > relax permissions for listmount()")
> >
> > Fixes: c938a616aadb ("IB/core: Add raw packet QP type")
> > Signed-off-by: Parav Pandit <parav at nvidia.com>
> >
> > ---
> > I would like to have feedback from the LSM experts to make sure this
> > fix is correct. Given the widespread usage of the capable() call, it
> > makes me wonder if the patch right.
> >
> > Secondly, I wasn't able to determine which primary namespace (such as
> > mount or IPC, etc.) to consider for the CAP_IPC_LOCK capability.
> > (not directly related to this patch, but as concept)
> > ---
> > drivers/infiniband/core/uverbs_cmd.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/infiniband/core/uverbs_cmd.c
> > b/drivers/infiniband/core/uverbs_cmd.c
> > index 5ad14c39d48c..8d6615f390f5 100644
> > --- a/drivers/infiniband/core/uverbs_cmd.c
> > +++ b/drivers/infiniband/core/uverbs_cmd.c
> > @@ -3198,7 +3198,7 @@ static int ib_uverbs_ex_create_flow(struct
> uverbs_attr_bundle *attrs)
> > if (cmd.comp_mask)
> > return -EINVAL;
> >
> > - if (!capable(CAP_NET_RAW))
> > + if (!ns_capable(current->nsproxy->net_ns->user_ns, CAP_NET_RAW))
> > return -EPERM;
> >
> > if (cmd.flow_attr.flags >= IB_FLOW_ATTR_FLAGS_RESERVED)
> > --
> > 2.26.2
> >
More information about the Linux-security-module-archive
mailing list