[PATCH v26 10/22] x86/sgx: Linux Enclave Driver

Andy Lutomirski luto at kernel.org
Thu Feb 20 18:51:37 UTC 2020


On Thu, Feb 20, 2020 at 10:13 AM Sean Christopherson
<sean.j.christopherson at intel.com> wrote:
>
> On Tue, Feb 18, 2020 at 07:26:31PM -0800, Jordan Hand wrote:
> > I ran our validation tests for the Open Enclave SDK against this patch
> > set and came across a potential issue.
> >
> > On 2/9/20 1:25 PM, Jarkko Sakkinen wrote:
> > > +/**
> > > + * sgx_encl_may_map() - Check if a requested VMA mapping is allowed
> > > + * @encl:          an enclave
> > > + * @start:         lower bound of the address range, inclusive
> > > + * @end:           upper bound of the address range, exclusive
> > > + * @vm_prot_bits:  requested protections of the address range
> > > + *
> > > + * Iterate through the enclave pages contained within [@start, @end) to verify
> > > + * the permissions requested by @vm_prot_bits do not exceed that of any enclave
> > > + * page to be mapped.  Page addresses that do not have an associated enclave
> > > + * page are interpreted as having zero permissions.
> > > + *
> > > + * Return:
> > > + *   0 on success,
> > > + *   -EACCES if VMA permissions exceed enclave page permissions
> > > + */
> > > +int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
> > > +                unsigned long end, unsigned long vm_prot_bits)
> > > +{
> > > +   unsigned long idx, idx_start, idx_end;
> > > +   struct sgx_encl_page *page;
> > > +
> > > +   /* PROT_NONE always succeeds. */
> > > +   if (!vm_prot_bits)
> > > +           return 0;
> > > +
> > > +   idx_start = PFN_DOWN(start);
> > > +   idx_end = PFN_DOWN(end - 1);
> > > +
> > > +   for (idx = idx_start; idx <= idx_end; ++idx) {
> > > +           mutex_lock(&encl->lock);
> > > +           page = radix_tree_lookup(&encl->page_tree, idx);
> > > +           mutex_unlock(&encl->lock);
> > > +
> > > +           if (!page || (~page->vm_max_prot_bits & vm_prot_bits))
> > > +                   return -EACCES;
> > > +   }
> > > +
> > > +   return 0;
> > > +}
> > > +static struct sgx_encl_page *sgx_encl_page_alloc(struct sgx_encl *encl,
> > > +                                            unsigned long offset,
> > > +                                            u64 secinfo_flags)
> > > +{
> > > +   struct sgx_encl_page *encl_page;
> > > +   unsigned long prot;
> > > +
> > > +   encl_page = kzalloc(sizeof(*encl_page), GFP_KERNEL);
> > > +   if (!encl_page)
> > > +           return ERR_PTR(-ENOMEM);
> > > +
> > > +   encl_page->desc = encl->base + offset;
> > > +   encl_page->encl = encl;
> > > +
> > > +   prot = _calc_vm_trans(secinfo_flags, SGX_SECINFO_R, PROT_READ)  |
> > > +          _calc_vm_trans(secinfo_flags, SGX_SECINFO_W, PROT_WRITE) |
> > > +          _calc_vm_trans(secinfo_flags, SGX_SECINFO_X, PROT_EXEC);
> > > +
> > > +   /*
> > > +    * TCS pages must always have RW set for CPU access while the SECINFO
> > > +    * permissions are *always* zero - the CPU ignores the user provided
> > > +    * values and silently overwrites them with zero permissions.
> > > +    */
> > > +   if ((secinfo_flags & SGX_SECINFO_PAGE_TYPE_MASK) == SGX_SECINFO_TCS)
> > > +           prot |= PROT_READ | PROT_WRITE;
> > > +
> > > +   /* Calculate maximum of the VM flags for the page. */
> > > +   encl_page->vm_max_prot_bits = calc_vm_prot_bits(prot, 0);
> >
> > During mprotect (mm/mprotect.c, line 525), the following check tests whether
> > the READ_IMPLIES_EXEC personality is set and PROT_READ is being requested.
> > If so, and VM_MAYEXEC is set, it also adds PROT_EXEC to the request.
> >
> >       if (rier && (vma->vm_flags & VM_MAYEXEC))
> >               prot |= PROT_EXEC;
> >
> > But if we look at sgx_encl_page_alloc(), we see vm_max_prot_bits is set
> > without taking VM_MAYEXEC into account:
> >
> >       encl_page->vm_max_prot_bits = calc_vm_prot_bits(prot, 0);
> >
> > sgx_encl_may_map() checks that the requested protection can be added with:
> >
> >       if (!page || (~page->vm_max_prot_bits & vm_prot_bits))
> >               return -EACCES;
> >
> > This means that for any process where READ_IMPLIES_EXEC is set, and any
> > page where (vma->vm_flags & VM_MAYEXEC) is nonzero, mmap/mprotect calls
> > that request PROT_READ on a page that was not added with PROT_EXEC will
> > fail.
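
The interaction described above can be modeled in userspace. This is only a
sketch, not the kernel code: the sketch_* names and SKETCH_VM_* constants are
made up for illustration, and the two functions mirror just the quoted
mprotect() adjustment and the per-page test in sgx_encl_may_map().

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative stand-ins for the read/write/exec protection bits. */
#define SKETCH_VM_READ  0x1UL
#define SKETCH_VM_WRITE 0x2UL
#define SKETCH_VM_EXEC  0x4UL

/*
 * Models the quoted mprotect() logic: when the process has the
 * read-implies-exec personality and asks for read access, exec is
 * added if the VMA may be made executable.
 */
static unsigned long sketch_apply_rie(unsigned long prot_bits,
                                      bool rie_personality,
                                      bool vma_may_exec)
{
	bool rier = rie_personality && (prot_bits & SKETCH_VM_READ);

	if (rier && vma_may_exec)
		prot_bits |= SKETCH_VM_EXEC;

	return prot_bits;
}

/*
 * Models the per-page test in sgx_encl_may_map(): any requested bit
 * that is missing from the page's maximum protections is refused.
 */
static int sketch_may_map(unsigned long vm_max_prot_bits,
                          unsigned long vm_prot_bits)
{
	/* PROT_NONE always succeeds. */
	if (!vm_prot_bits)
		return 0;

	if (~vm_max_prot_bits & vm_prot_bits)
		return -EACCES;

	return 0;
}
```

With a page added as read-only, a plain read request passes, but the same
request after sketch_apply_rie() with the personality set carries the exec bit
and is refused with -EACCES.
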
>
> I could've sworn this was discussed on the SGX list at one point, but
> apparently we only discussed it internally.  Anyways...
>
> More than likely, the READ_IMPLIES_EXEC (RIE) crud rears its head
> because part of the enclave loader is written in assembly.  Unless
> explicitly told otherwise, the linker assumes that any program with
> assembly code may need an executable stack, which leads to the RIE
> personality being set for the process.  Here's a fantastic write up for
> more details: https://www.airs.com/blog/archives/518
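
To confirm whether a given loader process actually ended up with RIE, one
option is to query the persona from userspace. A minimal sketch, assuming Linux
with glibc's <sys/personality.h> (where the flag is spelled READ_IMPLIES_EXEC);
the sketch_* helper names are hypothetical:

```c
#include <stdbool.h>
#include <sys/personality.h>

/* Returns true if the given persona value has READ_IMPLIES_EXEC set. */
static bool sketch_persona_has_rie(int persona)
{
	/* personality() reports failure as -1; treat that as "unknown/no". */
	if (persona == -1)
		return false;

	return (persona & READ_IMPLIES_EXEC) != 0;
}

/* Returns true if the calling process currently has RIE set. */
static bool sketch_process_has_rie(void)
{
	/* Passing 0xffffffff queries the persona without changing it. */
	return sketch_persona_has_rie(personality(0xffffffff));
}
```

If this reports RIE for the loader, marking the assembly sources with an empty
.note.GNU-stack section, or linking with -z noexecstack, is the usual way to
stop the toolchain from requesting an executable stack.
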
>
> There are essentially two paths we can take:
>
>  1) Exempt EPC pages from RIE during mmap()/mprotect(), i.e. don't add
>     PROT_EXEC for enclaves.

Seems reasonable.

Honestly, it probably makes sense to try to exempt almost everything
from RIE.  I'd be a bit surprised if RIE is actually useful for
anything other than plain anonymous pages and private file mappings.


