SGX vs LSM (Re: [PATCH v20 00/28] Intel SGX1 support)

Sun May 26 06:09:38 UTC 2019

> From: Andy Lutomirski [mailto:luto at kernel.org]
> Sent: Saturday, May 25, 2019 5:58 PM
> 
> On Sat, May 25, 2019 at 3:40 PM Xing, Cedric <cedric.xing at intel.com> wrote:
> >
> > > From: Andy Lutomirski [mailto:luto at amacapital.net]
> > > Sent: Friday, May 24, 2019 4:42 PM
> > >
> > > > On May 24, 2019, at 3:41 PM, Sean Christopherson <sean.j.christopherson at intel.com>
> wrote:
> > > >
> > > >> On Fri, May 24, 2019 at 02:27:34PM -0700, Andy Lutomirski wrote:
> > > >> On Fri, May 24, 2019 at 1:03 PM Sean Christopherson
> > > >> <sean.j.christopherson at intel.com> wrote:
> > > >>>
> > > >>>> On Fri, May 24, 2019 at 12:37:44PM -0700, Andy Lutomirski wrote:
> > > >>>>> On Fri, May 24, 2019 at 11:34 AM Xing, Cedric <cedric.xing at intel.com> wrote:
> > > >>>>>
> > > >>>>> If "initial permissions" for enclaves are less restrictive
> > > >>>>> than shared objects, then it'd become a backdoor for
> > > >>>>> circumventing LSM when enclave whitelisting is *not* in place.
> > > >>>>> For example, an adversary may load a page, which would
> > > >>>>> otherwise never be executable, as an executable
> > > page in EPC.
> > > >>>>>
> > > >>>>> In the case a RWX page is needed, the calling process has to
> > > >>>>> have a RWX page serving as the source for EADD so
> > > >>>>> PROCESS__EXECMEM will have been checked. For SGX2, changing an
> > > >>>>> EPC page to RWX is subject to FILE__EXECMEM on
> > > >>>>> /dev/sgx/enclave, which I see as a security benefit because it
> > > >>>>> only affects the enclave but not the whole process hosting
> > > it.
> > > >>>>
> > > >>>> So the permission would be like FILE__EXECMOD on the source
> > > >>>> enclave page, because it would be mapped MAP_ANONYMOUS, PROT_WRITE?
> > > >>>> MAP_SHARED, PROT_WRITE isn't going to work because that means
> > > >>>> you can modify the file.
> > > >>>
> > > >>> Was this in response to Cedric's comment, or to my comment?
> > > >>
> > > >> Yours.  I think that requiring source pages to be actually mapped
> > > >> W is not such a great idea.
> > > >
> > > > I wasn't requiring source pages to be mapped W.  At least I didn't
> > > > intend to require W.  What I was trying to say is that SGX could
> > > > trigger an EXECMEM check if userspace attempted to EADD or EAUG an
> > > > enclave page with RWX permissions, e.g.:
> > > >
> > > >  if ((SECINFO.PERMS & RWX) == RWX) {
> > > >      ret = security_mmap_file(NULL, RWX, ???);
> > > >      if (ret)
> > > >          return ret;
> > > >  }
> > > >
> > > > But that's a moot point if we add security_enclave_load() or whatever.
> > > >
> > > >>
> > > >>>
> > > >>>> I'm starting to think that looking at the source VMA permission
> > > >>>> bits or source PTE permission bits is putting a bit too much
> > > >>>> policy into the driver as opposed to the LSM.  How about
> > > >>>> delegating the whole thing to an LSM hook?  The EADD operation
> > > >>>> would invoke a new hook, something like:
> > > >>>>
> > > >>>> int security_enclave_load_bytes(void *source_addr, struct
> > > >>>> vm_area_struct *source_vma, loff_t source_offset, unsigned int
> > > >>>> maxperm);
> > > >>>>
> > > >>>> Then you don't have to muck with mapping anything PROT_EXEC.
> > > >>>> Instead you load from a mapping of a file and the LSM applies
> > > >>>> whatever policy it feels appropriate.  If the first pass gets
> > > >>>> something wrong, the application or library authors can take it
> > > >>>> up with the SELinux folks without breaking the whole ABI :)
> > > >>>>
> > > >>>> (I'm proposing passing in the source_vma because this hook
> > > >>>> would be called with mmap_sem held for read to avoid a TOCTOU
> > > >>>> race.)
> > > >>>>
> > > >>>> If we go this route, the only substantial change to the
> > > >>>> existing driver that's needed for an initial upstream merge is
> > > >>>> the maxperm mechanism and whatever hopefully minimal API
> > > >>>> changes are needed to allow users to conveniently set up the
> > > >>>> mappings.  And we don't need to worry about how to hack around
> > > >>>> mprotect() calling into the LSM, because the LSM will actually
> > > >>>> be aware of SGX and can just do the right thing.
> > > >>>
> > > >>> This doesn't address restricting which processes can run which
> > > >>> enclaves, it only allows restricting the build flow.  Or are you
> > > >>> suggesting this be done in addition to whitelisting sigstructs?
> > > >>
> > > >> In addition.
> > > >>
> > > >> But I named the function badly and gave it a bad signature, which
> > > >> confused you.  Let's try again:
> > > >>
> > > >> int security_enclave_load_from_memory(const struct vm_area_struct
> > > >> *source, unsigned int maxperm);
> > > >
> > > > I prefer security_enclave_load(), "from_memory" seems redundant at best.
> > >
> > > Fine with me.
> >
> > If we think of EADD as a way of mmap()'ing an enclave file into memory, would this
> security_enclave_load() be the same as security_mmap_file(source_vma->vm_file, maxperm,
> MAP_PRIVATE), except that the target is now EPC instead of regular pages?
> 
> Hmm, that's clever.  Although it seems plausible that an LSM would want to allow RX or RWX
> of a given file page but only in the context of an approved enclave, so I think it should
> still be its own hook.

What do you mean by "in the context of an approved enclave"? EPC pages are *inaccessible* to any software until after EINIT. So it would never be a security concern to EADD a page with wrong permissions as long as the enclave would be denied eventually by LSM at EINIT.

But I acknowledge the difference between loading a page into regular memory vs. into EPC. So it's beneficial to have a separate hook, which if not hooked, would pass through to security_mmap_file() by default? 

> 
> >
> > >
> > > >
> > > >> Maybe some really fancy future LSM would also want loff_t
> > > >> source_offset, but it's probably not terribly useful.  This same
> > > >> callback would be used for EAUG.
> >
> > EAUG always zeroes the EPC page before making it available to an enclave. So I don't
> think there's anything needed to done here.
> 
> Duh.  So security_enclave_load_zeros() for EAUG.  See below.
> 
> >
> > > >>
> > > >> Following up on your discussion with Cedric about sigstruct, the
> > > >> other callback would be something like:
> > > >>
> > > >> int security_enclave_init(struct file *sigstruct_file);
> >
> > I'd still insist in using a pointer rather than a file, for reasons that we've discussed
> before. For those who can't recall, the major reason is that most implementation would
> embed SIGSTRUCT into the same file as the enclave (or at least I don't want to prevent
> anyone from doing so), which could also be part of another file, such as a shared object
> or even the main executable itself. It could be difficult to obtain a fd in those cases.
> memfd won't work because it can't retain the same attributes of the original file
> containing the SIGSTRUCT.
> >
> > After all, what matters is the attributes associated with the backing file, which could
> be easily retrieve from vm_file of the covering VMA. So for the sake of flexibility, let's
> stay with what we've agreed before - a pointer to SIGSTRUCT.
> 
> I'm okay with this, except for one nastiness: there's a big difference between a file that
> is just a sigstruct and a file that contains essentially arbitrary data plus a sigstruct
> at an arbitrary offset.
> We could do something tricky like saying that SIGSTRUCT can be in a file that's just a
> SIGSTRUCT or it can be in a special SIGSTRUCT ELF note in a file that isn't just a
> SIGSTRUCT, but that could be annoyingly restrictive.

Agreed. Approving a file implies approving all SIGSTRUCTs within that file. But I guess it wouldn't cause practical problems.

> 
> If it's going to be in an arbitrary file, then I think the signature needs to be more like:
> 
> int security_enclave_init(struct vm_area_struct *sigstruct_vma, loff_t sigstruct_offset,
> const sgx_sigstruct *sigstruct);
> 
> So that the LSM still has the opportunity to base its decision on the contents of the
> SIGSTRUCT.  Actually, we need that change regardless.

Wouldn't the pair of { sigstruct_vma, sigstruct_offset } be the same as just a pointer, because the VMA could be looked up using the pointer and the offset would then be (pointer - vma->vm_start)?

> 
> >
> > > >>
> > > >> The main issue I see is that we also want to control the
> > > >> enclave's ability to have RWX pages or to change a W page to X.
> > > >> We might also
> > > >> want:
> > > >>
> > > >> int security_enclave_load_zeros(unsigned int maxperm);
> > > >
> > > > What's the use case for this?  @maxperm will always be at least RW
> > > > in this case, otherwise the page is useless to the enclave, and if
> > > > the enclave can write the page, the fact that it started as zeros
> > > > is irrelevant.
> > >
> > > This is how EAUG could ask if RWX is okay. If an enclave is
> > > internally doing dynamic loading, the it will need a heap page with
> > > maxperm = RWX.  (If it’s well designed, it will make it RW and then
> > > RX, either by changing SECINFO or by asking the host to mprotect() it, but it still
> needs the overall RWX mask.).
> >
> > Any new page EAUG'ed will start in RW (as dictated by SGX ISA). EACCEPTCOPY will then
> change it to RX. RWX is never needed for all practical purposes. This in fact could be
> gated by mprotect() and the attributes associated with /dev/sgx/enclave. In the case of
> SELinux, FILE__EXECMOD is the right attribute and mprotect() will take care of all the
> rest. I don't see why the driver need a role here.
> 
> I find the SDM's discussion of EAUG, EACCEPT, and EACCEPTCOPY to be extremely confusing.
> My copy of the SDM has EACCEPT's SECINFO argument as "Read access permitted by Non
> Enclave".  Is that an error?

I'm confused by those descriptions too. Guess I cannot comment if that's an error or not.

Anyway, per our internal documents, for EAUG, SECINFO has to be set to PT_REG|RW. For EACCEPT, SGX ISA compares supplied SECINFO with EPCM attributes and returns an error if they don't match. EACCEPTCOPY only works on pending pages (i.e. SECINFO.P must be set), and sets EPCM access permissions to whatever supplied in SECINFO. 

>  And is EACCEPTCOPY just EACCEPT + memcpy or is there some other fundamental difference?
> 38.5.7 doesn't even mention EACCEPTCOPY.

2 differences: 1) EACCEPT only *compares* but EACCEPTCOPY *sets* EPCM permissions; and 2) EACCEPTCOPY does EACCEPT+memcpy atomically.

> 
> Anyway, all my confusion aside, I was talking about the page table, not the EPCM.  I think
> the enclave should need permission to write its own content into a page that will ever
> become X, and the enclave's untrusted host library would do this by adding the page with
> MAXPERM=RWX and then mapping/mprotecting it as PROT_WRITE and then (later or
> simultaneously) PROT_EXEC.

I was talking about the same thing. A code page in EPC will start in RW (both EPCM and PTE) and end in RX (both EPCM and PTE). EACCEPTCOPY takes care of EPCM, while mprotect() could take care of PTE as long as /dev/sgx/enclave has FILE__EXECMOD.

I understand your intention to enclave pages to segments with different MAXPERMs. My concern is though the host process may not always have a priori knowledge on which ranges to be used as code vs. data. After all, only the weakest link matters in security so I think what a host process cares would be whether the enclave loads code dynamically, or expands its data segments only, or neither. And for that reason, I think it more "user friendly" to keep just one MAXPERM - i.e. the most permissive one. Then we could associate that with /dev/sgx/enclave so as to relieve the driver from keeping track of too many things. 

> 
> Since SGX2 doesn't seem to have a way to add an initialized page to EPC after an enclave
> starts, I guess that it's impossible to have the enclave do something like dlopen()
> without MAXPERM=RWX.  So be it.

That's true. The reason behind it is SGX doesn’t trust anything from outside. So non-predetermined contents must be measured (e.g. EADD+EEXTEND), but it's hard to measure (or attest to) dynamically added contents so we decided to allow predetermined contents (i.e. all zeros in the case of EAUG) only.

> Maybe someone will find this annoying someday and SGX3 will add EAUG-but-don't-zero and
> EACCEPT-with-existing-contents.

>From security perspective, accepting a page that is measured/hashed to XYZ is equivalent to overwriting that page with content hashed to XYZ. EACCEPTCOPY actually does the latter. The annoying part is due to the mismatch between SGX ISA and the s/w model adopted by LSM, but that has nothing to do with security. 

> 
> >
> > >
> > > Also, do real SGX1 enclave formats have BSS? If so, then either we
> > > need an ioctl or load zeros or user code is going to load from
> > > /dev/zero or just from the heap, but the LSM is going to play better
> > > with an ioctl, I suspect :)
> >
> > Yes, it does. But an enclave would either measure BSS, in which case the initial bytes
> have to be zero or MRENCLAVE will change; or zero BSS explicitly in its initialization
> code.
> >
> > But from LSM's perspective it makes no difference than EADD'ing a page with non-zero
> content. And security_enclave_load(NULL, RW) would take care of it in exactly in the same
> way.
> 
> Sure, I suppose the same hook with NULL parameters would be equivalent.
> 
> >
> > >
> > > >
> > > >> An enclave that's going to modify its own code will need memory
> > > >> with maxperm = RWX or WX.
> >
> > With SGX2/EDMM, RWX is *never* needed for all practical purposes.
> >
> > In theory, in terms of security, no page shall be made executable while it is still
> being prepared. So W and X shall always be mutually exclusive, regardless it's in EPC or
> regular memory.
> >
> > RWX is only needed in SGX1, as a workaround for certain usages, because EPCM permissions
> can never change at runtime.
> 
> As above, I think I disagree.  MAXPERM is intended as an upper bound on the permissions
> that a page can ever have, at least until it's EREMOVEd and re-added.  Since there's no
> EAUG-but-don't-zero, EAUG with MAXPERM.W=0 is basically useless because the page can never
> contain anything other than zeros, so a dynamically allocated page that is ever executed
> has to have MAXPERM=RWX or MAXPERM=WX.  And that will need special permissions, which I
> think is consistent with your recent emails on how this could all map to SELinux
> permissions.

I'm totally with you. What I was trying to say was that only W or X would be needed at any given time. That said, MAXPERM=RWX but PROCESS__EXECMEM will not be needed, while FILE__EXECMOD will be needed only on /dev/sgx/enclave. So the inherent risk is contained. 

> 
> >
> > > >>
> > > >> But this is a bit awkward if the LSM's decision depends on the
> > > >> sigstruct.  We could get fancy and require that the sigstruct be
> > > >> supplied before any EADD operations so that the maxperm decisions
> > > >> can depend on the sigstruct.
> > > >>
> > > >> Am I making more sense now?
> > > >
> > > > Yep.  Requiring .sigstruct at ECREATE would be trivial.  If we
> > > > wanted flexibility we could do:
> > > >
> > > >   int security_enclave_load(struct file *file, struct vm_area_struct *vma,
> > > >                             unsigned long prot);
> > > >
> > > > And for ultimate flexibility we could pass both .sigstruct and the
> > > > file pointer for /dev/sgx/enclave, but that seems a bit ridiculous.
> > >
> > > I agree.
> >
> > Loosely speaking, an enclave (including initial contents of all of its pages and their
> permissions) and its MRENCLAVE are a 1-to-1 correspondence (given the collision resistant
> property of SHA-2). So only one is needed for a decision, and either one would lead to the
> same decision. So I don't see anything making any sense here.
> >
> > Theoretically speaking, if LSM can make a decision at EINIT by means of
> security_enclave_load(), then security_enclave_load() is never needed.
> >
> > In practice, I support keeping both because security_enclave_load() can only approve an
> enumerable set while security_enclave_load() can approve a non-enumerable set of enclaves.
> Moreover, in order to determine the validity of a MRENCLAVE (as in development of a policy
> or in creation of a white/black list), system admins will need the audit log produced by
> security_enclave_load().
> 
> I'm confused.  Things like MRSIGNER aren't known until the SIGSTRUCT shows up.  Also,
> security_enclave_load() provides no protection against loading a mishmash of two different
> enclave files.  I see
> security_enclave_init() as "verify this SIGSTRUCT against your policy on who may sign
> enclaves and/or grant EXECMOD depending on SIGSTRUCT"
> and security_enclave_load() as "implement your EXECMOD / EXECUTE / WRITE / whatever policy
> and possibly check enclave files for some label."

Sorry for the confusion. I was saying the same thing except that the decision of security_enclave_load() doesn't have to depend on SIGSTRUCT. Given your prototype of security_enclave_load(), I think we are on the same page. I made the above comment to object to the idea of "require that the sigstruct be supplied before any EADD operations so that the maxperm decisions can depend on the sigstruct".

> 
> >
> > >
> > > >
> > > > Passing both would allow tying EXECMOD to /dev/sgx/enclave as
> > > > Cedric wanted (without having to play games and pass
> > > > /dev/sgx/enclave to security_enclave_load()), but I don't think
> > > > there's anything fundamentally broken with using .sigstruct for
> > > > EXECMOD.  It requires more verbose labeling, but that's not a bad thing.
> > >
> > > The benefit of putting it on .sigstruct is that it can be per-enclave.
> > >
> > > As I understand it from Fedora packaging, the way this works on
> > > distros is generally that a package will include some files and
> > > their associated labels, and, if the package needs EXECMOD, then the
> > > files are labeled with EXECMOD and the author of the relevant code might get a dirty
> look.
> > >
> > > This could translate to the author of an exclave that needs RWX
> > > regions getting a dirty look without leaking this permission into other enclaves.
> > >
> > > (In my opinion, the dirty looks are actually the best security
> > > benefit of the entire concept of LSMs making RWX difficult.  A
> > > sufficiently creative attacker can almost always bypass W^X
> > > restrictions once they’ve pwned you, but W^X makes it harder to pwn
> > > you in the first place, and SELinux makes it really obvious when
> > > packaging a program that doesn’t respect W^X.  The upshot is that a
> > > lot of programs got fixed.)
> >
> > I'm lost here. Dynamically linked enclaves, if running on SGX2, would need RW->RX, i.e.
> FILE__EXECMOD on /dev/sgx/enclave. But they never need RWX, i.e. PROCESS__EXECMEM.
> 
> Hmm.  If we want to make this distinction, we need something a big richer than my proposed
> callbacks.  A check of the actual mprotect() /
> mmap() permissions would also be needed.  Specifically, allowing MAXPERM=RWX wouldn't
> imply that PROT_WRITE | PROT_EXEC is allowed.

If we keep only one MAXPERM, wouldn't this be the current behavior of mmap()/mprotect()?

To be a bit more clear, system admin sets MAXPERM upper bound in the form of FILE__{READ|WRITE|EXECUTE|EXECMOD} of /dev/sgx/enclave. Then for a process/enclave, if what it requires falls below what's allowed on /dev/sgx/enclave, then everything will just work. Otherwise, it fails in the form of -EPERM returned from mmap()/mprotect(). Please note that MAXPERM here applies to "runtime" permissions, while "initial" permissions are taken care of by security_enclave_{load|init}. "initial" permissions could be more permissive than "runtime" permissions, e.g., RX is still required for initial code pages even though system admins could disable dynamically loaded code pages by *not* giving FILE__{EXECUTE|EXECMOD}. Therefore, the "initial" mapping would still have to be done by the driver (to bypass LSM), either via a new ioctl or as part of IOC_EINIT.