[PATCH v5 2/5] userfaultfd: add /dev/userfaultfd for fine grained access control

Mike Rapoport rppt at kernel.org
Thu Aug 11 06:37:25 UTC 2022


On Mon, Aug 08, 2022 at 10:56:11AM -0700, Axel Rasmussen wrote:
> Historically, it has been shown that intercepting kernel faults with
> userfaultfd (thereby forcing the kernel to wait for an arbitrary amount
> of time) can be exploited, or at least can make some kinds of exploits
> easier. So, in 37cd0575b8 ("userfaultfd: add UFFD_USER_MODE_ONLY") we
> changed things so that, for kernel faults to be handled by userfaultfd,
> either the process needs CAP_SYS_PTRACE, or the
> vm.unprivileged_userfaultfd sysctl must be set so that any unprivileged
> user can do it.
> 
> In a typical implementation of a hypervisor with live migration (take
> QEMU/KVM as one such example), we do indeed need to be able to handle
> kernel faults. But, both options above are less than ideal:
> 
> - Toggling the sysctl increases attack surface by allowing any
>   unprivileged user to handle kernel faults.
> 
> - Granting the live migration process CAP_SYS_PTRACE gives it this
>   ability, but *also* the ability to "observe and control the
>   execution of another process [...], and examine and change [its]
>   memory and registers" (from ptrace(2)). This isn't something we need
>   or want to be able to do, so granting this permission violates the
>   "principle of least privilege".
> 
> This is all a long-winded way to say: we want a more fine-grained way to
> grant access to userfaultfd, without granting other additional
> permissions at the same time.
> 
> To achieve this, add a /dev/userfaultfd misc device. This device
> provides an alternative to the userfaultfd(2) syscall for the creation
> of new userfaultfds. The idea is, any userfaultfds created this way will
> be able to handle kernel faults, without the caller having any special
> capabilities. Access to this mechanism is instead restricted using e.g.
> standard filesystem permissions.
> 
> Acked-by: Nadav Amit <namit at vmware.com>
> Acked-by: Peter Xu <peterx at redhat.com>
> Signed-off-by: Axel Rasmussen <axelrasmussen at google.com>

Acked-by: Mike Rapoport <rppt at linux.ibm.com>

> ---
>  fs/userfaultfd.c                 | 73 +++++++++++++++++++++++++-------
>  include/uapi/linux/userfaultfd.h |  4 ++
>  2 files changed, 61 insertions(+), 16 deletions(-)

-- 
Sincerely yours,
Mike.


