[RFC PATCH v1 0/9] Hypervisor-Enforced Kernel Integrity

Mickaël Salaün mic at digikod.net
Fri May 26 15:22:03 UTC 2023


On 25/05/2023 15:59, Mickaël Salaün wrote:
> 
> On 25/05/2023 00:20, Edgecombe, Rick P wrote:
>> On Fri, 2023-05-05 at 17:20 +0200, Mickaël Salaün wrote:
>>> # How does it work?
>>>
>>> This implementation mainly leverages KVM capabilities to control the
>>> Second
>>> Layer Address Translation (or the Two Dimensional Paging e.g.,
>>> Intel's EPT or
>>> AMD's RVI/NPT) and Mode Based Execution Control (Intel's MBEC)
>>> introduced with
>>> the Kaby Lake (7th generation) architecture. This allows to set
>>> permissions on
>>> memory pages in a complementary way to the guest kernel's managed
>>> memory
>>> permissions. Once these permissions are set, they are locked and
>>> there is no
>>> way back.
>>>
>>> A first KVM_HC_LOCK_MEM_PAGE_RANGES hypercall enables the guest
>>> kernel to lock
>>> a set of its memory page ranges with either the HEKI_ATTR_MEM_NOWRITE
>>> or the
>>> HEKI_ATTR_MEM_EXEC attribute. The first one denies write access to a
>>> specific
>>> set of pages (allow-list approach), and the second only allows kernel
>>> execution
>>> for a set of pages (deny-list approach).
>>>
>>> The current implementation sets the whole kernel's .rodata (i.e., any
>>> const or
>>> __ro_after_init variables, which includes critical security data such
>>> as LSM
>>> parameters) and .text sections as non-writable, and the .text section
>>> is the
>>> only one where kernel execution is allowed. This is possible thanks
>>> to the new
>>> MBEC support also brough by this series (otherwise the vDSO would
>>> have to be
>>> executable). Thanks to this hardware support (VT-x, EPT and MBEC),
>>> the
>>> performance impact of such guest protection is negligible.
>>>
>>> The second KVM_HC_LOCK_CR_UPDATE hypercall enables guests to pin some
>>> of its
>>> CPU control register flags (e.g., X86_CR0_WP, X86_CR4_SMEP,
>>> X86_CR4_SMAP),
>>> which is another complementary hardening mechanism.
>>>
>>> Heki can be enabled with the heki=1 boot command argument.
>>>
>>>
>>
>> Can the guest kernel ask the host VMM's emulated devices to DMA into
>> the protected data? It should go through the host userspace mappings I
>> think, which don't care about EPT permissions. Or did I miss where you
>> are protecting that another way? There are a lot of easy ways to ask
>> the host to write to guest memory that don't involve the EPT. You
>> probably need to protect the host userspace mappings, and also the
>> places in KVM that kmap a GPA provided by the guest.
> 
> Good point, I'll check this confused deputy attack. Extended KVM
> protections should indeed handle all ways to map guests' memory. I'm
> wondering if current VMMs would gracefully handle such new restrictions
> though.

I guess the host could map arbitrary data to the guest, so that need to 
be handled, but how could the VMM (not the host kernel) bypass/update 
EPT initially used for the guest (and potentially later mapped to the host)?



More information about the Linux-security-module-archive mailing list