[RFC v2 00/13] Multi-Key Total Memory Encryption API (MKTME)

Andy Lutomirski luto at kernel.org
Thu Dec 6 19:10:43 UTC 2018


> On Dec 6, 2018, at 7:39 AM, Dave Hansen <dave.hansen at intel.com> wrote:

>>>> the direct map as well, probably using the pageattr.c code.
>>>
>>> The current, public hardware spec has a description of what's required
>>> to maintain cache coherency.  Basically, you can keep as many mappings
>>> of a physical page as you want, but only write to one mapping at a time,
>>> and clflush the old one when you want to write to a new one.
>>
>> Surely you at least have to clflush the old mapping and then the new
>> mapping, since the new mapping could have been speculatively read.
>
> Nope.  The coherency is "fine" unless you have writeback of an older
> cacheline that blows away newer data.  CPUs that support MKTME are
> guaranteed to never do writeback of the lines that could be established
> speculatively or from prefetching.

How is that sufficient?  Suppose I have some physical page mapped with
keys 1 and 2.  Mapping 1 is logically live and I write to it.  Then I
prefetch or otherwise populate mapping 2 into the cache (in the S state,
presumably).  Now I clflush mapping 1 and read mapping 2.  It contains
garbage in the cache, but the garbage in the cache is inconsistent with
the garbage in memory.  This can’t be a good thing, even if no writeback
occurs.

I suppose the right fix is to clflush the old mapping and then to zero
the new mapping.
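
To make that sequence concrete, here is a minimal sketch of what such a
rekey path could look like, assuming an x86 kernel context.  The
function name and the old_va/new_va mapping pointers are made up for
illustration; clflush_cache_range() and memset() are the existing
helpers one would presumably reach for:

#include <linux/mm.h>           /* PAGE_SIZE */
#include <linux/string.h>       /* memset() */
#include <asm/cacheflush.h>     /* clflush_cache_range() */

/*
 * Hypothetical helper: move a physical page from the key behind old_va
 * to the key behind new_va.  Both pointers map the same physical page
 * through different keyids.
 */
static void mktme_rekey_page(void *old_va, void *new_va)
{
	/*
	 * Flush every cache line of the old mapping so that no stale
	 * line (established under the old key) can later be written
	 * back over data established through the new mapping.
	 */
	clflush_cache_range(old_va, PAGE_SIZE);

	/*
	 * Zero the page through the new mapping.  This establishes
	 * fresh cache lines under the new key and ensures nobody ever
	 * reads the garbage that decrypting the old ciphertext with
	 * the new key would produce.
	 */
	memset(new_va, 0, PAGE_SIZE);
}

The zeroing is what closes the hole described above: once the page has
been rewritten through the new mapping, there is no stale-but-cached
data left to be inconsistent with memory.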

>
>>>> Finally, if you're going to teach the kernel how to have some user
>>>> pages that aren't in the direct map, you've essentially done XPO,
>>>> which is nifty but expensive.  And I think that doing this gets you
>>>> essentially all the benefit of MKTME for the non-pmem use case.  Why
>>>> exactly would any software want to use anything other than a
>>>> CPU-managed key for anything other than pmem?
>>>
>>> It is handy, for one, to let you "cluster" key usage.  If you have 5
>>> Pepsi VMs and 5 Coke VMs, each Pepsi one using the same key and each
>>> Coke one using the same key, you can boil it down to only 2 hardware
>>> keyid slots that get used, and do this transparently.
>>
>> I understand this from a marketing perspective but not a security
>> perspective.  Say I'm Coke and you've sold me some VMs that are
>> "encrypted with a Coke-specific key and no other VMs get to use that
>> key."  I can't think of *any* not-exceedingly-contrived attack in
>> which this makes the slightest difference.  If Pepsi tries to attack
>> Coke without MKTME, then they'll either need to get the hypervisor to
>> leak Coke's data through the direct map or they'll have to find some
>> way to corrupt a page table or use something like L1TF to read from a
>> physical address Coke owns.  With MKTME, if they can read through the
>> host direct map, then they'll get Coke's cleartext, and if they can
>> corrupt a page table or use L1TF to read from your memory, they'll get
>> Coke's cleartext.
>
> The design definitely has the hypervisor in the trust boundary.  If the
> hypervisor is evil, or if someone evil compromises the hypervisor, MKTME
> obviously provides less protection.
>
> I guess the question ends up being whether this makes its protections
> weak enough that we should not bother merging it in its current form.

Indeed, but I’d ask another question too: I expect that MKTME is weak
enough that it will be improved, and without seeing the improvement,
it seems quite plausible that using the improvement will require
radically reworking the kernel implementation.

As a straw man, suppose we get a way to say “this key may only be
accessed through such-and-such VPID or by using a special new
restricted facility for the hypervisor to request access”.  Now we have
some degree of serious protection, but it doesn’t work, by design, for
anonymous memory.  Similarly, something that looks more like AMD's SEV
would be very, very awkward to support with anything like the current
API proposal.

>
> I still have the homework assignment to go figure out why folks want the
> protections as they stand.


