[PATCH v7 10/14] securityfs: Extend securityfs with namespacing support

Thu Dec 16 13:40:27 UTC 2021

On Thu, Dec 16, 2021 at 12:43:19AM -0500, Stefan Berger wrote:
> From: Stefan Berger <stefanb at linux.ibm.com>
> 
> Extend 'securityfs' for support of IMA namespacing so that each
> IMA (user) namespace can have its own front-end for showing the currently
> active policy, the measurement list, number of violations and so on.
> 
> Drop the addition dentry reference to enable simple cleanup of dentries
> upon umount.
> 
> Prevent mounting of an instance of securityfs in another user namespace
> than it belongs to. Also, prevent accesses to directories when another
> user namespace is active than the one that the instance of securityfs
> belongs to.
> 
> Signed-off-by: Stefan Berger <stefanb at linux.ibm.com>
> Signed-off-by: James Bottomley <James.Bottomley at HansenPartnership.com>
> ---
>  security/inode.c | 37 ++++++++++++++++++++++++++++++++++---
>  1 file changed, 34 insertions(+), 3 deletions(-)
> 
> diff --git a/security/inode.c b/security/inode.c
> index fee01ff4d831..a0d9f086e3d5 100644
> --- a/security/inode.c
> +++ b/security/inode.c
> @@ -26,6 +26,29 @@
>  static struct vfsmount *init_securityfs_mount;
>  static int init_securityfs_mount_count;
>  
> +static int securityfs_permission(struct user_namespace *mnt_userns,
> +				 struct inode *inode, int mask)
> +{
> +	int err;
> +
> +	err = generic_permission(&init_user_ns, inode, mask);
> +	if (!err) {
> +		if (inode->i_sb->s_user_ns != current_user_ns())
> +			err = -EACCES;

I really think the correct semantics is to grant all callers access
whose user namespace is the same as or an ancestor of the securityfs
userns. It's weird to deny access to callers who are located in an
ancestor userns.

For example, a privileged process on the host should be allowed to setns
to the userns of an unprivileged container and inspect its securityfs
instance.

We're mostly interested to block such as scenarios where two sibling
unprivileged containers are created in the initial userns and an fd
proxy or something funnels a file descriptor from one sibling container
to the another one and the receiving sibling container can use readdir()
or openat() on this fd. (I'm not even convinced that this is actually a
problem but stricter semantics at the beginning can't hurt. We can
always relax this later.)