[RFC PATCH 0/9] security: x86/sgx: SGX vs. LSM

Stephen Smalley sds at tycho.nsa.gov
Tue Jun 4 15:33:44 UTC 2019


On 6/3/19 2:30 PM, Xing, Cedric wrote:
>> From: Christopherson, Sean J
>> Sent: Monday, June 03, 2019 10:16 AM
>>
>> On Sun, Jun 02, 2019 at 12:29:35AM -0700, Xing, Cedric wrote:
>>> Hi Sean,
>>>
>>> Generally I agree with your direction but think ALLOW_* flags are
>>> completely internal to LSM because they can be both produced and
>>> consumed inside an LSM module. So spilling them into SGX driver and
>>> also user mode code makes the solution ugly and in some cases
>>> impractical because not every enclave host process has a priori
>>> knowledge on whether or not an enclave page would be EMODPE'd at
>> runtime.
>>
>> In this case, the host process should tag *all* pages it *might* convert
>> to executable as ALLOW_EXEC.  LSMs can (and should/will) be written in
>> such a way that denying ALLOW_EXEC is fatal to the enclave if and only
>> if the enclave actually attempts mprotect(PROT_EXEC).
> 
> What if those pages contain self-modifying code but the host doesn't know ahead of time? Would it require ALLOW_WRITE|ALLOW_EXEC at EADD? Then would it prevent those pages to start with PROT_EXEC?
> 
> Anyway, my point is that it is unnecessary even if it works.
> 
>>
>> Take the SELinux path for example.  The only scenario in which
>> PROT_WRITE is cleared from @allowed_prot is if the page *starts* with
>> PROT_EXEC.
>> If PROT_EXEC is denied on a page that starts RW, e.g. an EAUG'd page,
>> then PROT_EXEC will be cleared from @allowed_prot.
>>
>> As Stephen pointed out, auditing the denials on @allowed_prot means the
>> log will contain false positives of a sort.  But this is more of a noise
>> issue than true false positives.  E.g. there are three possible outcomes
>> for the enclave.
>>
>>    - The enclave does not do EMODPE[PROT_EXEC] in any scenario, ever.
>>      Requesting ALLOW_EXEC is either a straightforward a userspace bug or
>>      a poorly written generic enclave loader.
>>
>>    - The enclave conditionally performs EMODPE[PROT_EXEC].  In this case
>>      the denial is a true false positive.
>>
>>    - The enclave does EMODPE[PROT_EXEC] and its host userspace then fails
>>      on mprotect(PROT_EXEC), i.e. the LSM denial is working as intended.
>>      The audit log will be noisy, but viewed as a whole the denials
>> aren't
>>      false positives.
> 
> What I was talking about was EMODPE[PROT_WRITE] on an RX page.
> 
>>
>> The potential for noisy audit logs and/or false positives is unfortunate,
>> but it's (by far) the lesser of many evils.
>>
>>> Theoretically speaking, what you really need is a per page flag (let's
>>> name it WRITTEN?) indicating whether a page has ever been written to
>>> (or more precisely, granted PROT_WRITE), which will be used to decide
>>> whether to grant PROT_EXEC when requested in future. Given the fact
>>> that all mprotect() goes through LSM and mmap() is limited to
>>> PROT_NONE, it's easy for LSM to capture that flag by itself instead of
>> asking user mode code to provide it.
>>>
>>> That said, here is the summary of what I think is a better approach.
>>> * In hook security_file_alloc(), if @file is an enclave, allocate some
>> data
>>>    structure to store for every page, the WRITTEN flag as described
>> above.
>>>    WRITTEN is cleared initially for all pages.
>>
>> This would effectively require *every* LSM to duplicate the SGX driver's
>> functionality, e.g. track per-page metadata, implement locking to
>> prevent races between multiple mm structs, etc...
> 
> Architecturally we shouldn't dictate how LSM makes decisions. ALLOW_* are no difference than PROCESS__* or FILE__* flags, which are just artifacts to assist particular LSMs in decision making. They are never considered part of the LSM interface, even if other LSMs than SELinux may adopt the same/similar approach.
> 
> If code duplication is what you are worrying about, you can put them in a library, or implement/export them in some new file (maybe security/enclave.c?) as utility functions. But spilling them into user mode is what I think is unacceptable.
> 
>>
>>>    Open: Given a file of type struct file *, how to tell if it is an
>> enclave (i.e. /dev/sgx/enclave)?
>>> * In hook security_mmap_file(), if @file is an enclave, make sure
>> @prot can
>>>    only be PROT_NONE. This is to force all protection changes to go
>> through
>>>    security_file_mprotect().
>>> * In the newly introduced hook security_enclave_load(), set WRITTEN
>> for pages
>>>    that are requested PROT_WRITE.
>>
>> How would an LSM associate a page with a specific enclave?  vma->vm_file
>> will point always point at /dev/sgx/enclave.  vma->vm_mm is useless
>> because we're allowing multiple processes to map a single enclave, not
>> to mention that by mm would require holding a reference to the mm.
> 
> Each open("/dev/sgx/enclave") syscall creates a *new* instance of struct file to uniquely identify one enclave instance. What I mean is @vma->vm_file, not @vma->vm_file->f_path or @vma->vm_file->f_inode.
> 
>>
>>> * In hook security_file_mprotect(), if @vma->vm_file is an enclave,
>> look up
>>>    and use WRITTEN flags for all pages within @vma, along with other
>> global
>>>    flags (e.g. PROCESS__EXECMEM/FILE__EXECMOD in the case of SELinux)
>> to decide
>>>    on allowing/rejecting @prot.
>>
>> vma->vm_file will always be /dev/sgx/enclave at this point, which means
>> LSMs don't have the necessary anchor back to the source file, e.g. to
>> enforce FILE__EXECUTE.  The noexec file system case is also unaddressed.
> 
> vma->vm_file identifies an enclave instance uniquely. FILE__EXECUTE is checked by security_enclave_load() using @source_vma->vm_file. Once a page has been EADD'ed, whether to allow RW->RX depends on .sigstruct file (more precisely, the file backing SIGSTRUCT), whose FILE__* attributes could be cached in vma->vm_file->f_security by security_enclave_init().

The RFC series seemed to dispense with the use of the sigstruct file and 
just used the source file throughout IIUC.  That allowed for reuse of 
FILE__* permissions without ambiguity rather than introducing separate 
ENCLAVE__* permissions or using /dev/sgx/enclave inode as the target of 
all checks.

Regardless, IIUC, your approach requires that we always check 
FILE__EXECMOD, and FILE__EXECUTE up front during security_enclave_load() 
irrespective of prot so that we can save the result in the f_security 
for later use by the mprotect hook.  This may generate many spurious 
audit messages for cases where PROT_EXEC will never be requested, and 
users will be prone to just always allowing it since they cannot tell 
when it was actually needed.

>   
> The noexec case should be addressed in IOC_ADD_PAGES by testing @source_vma->vm_flags & VM_MAYEXEC.
> 
>>
>>> * In hook security_file_free(), if @file is an  enclave, free storage
>>>    allocated for WRITTEN flags.



More information about the Linux-security-module-archive mailing list