[RFC PATCH v1 2/3] LSM/x86/sgx: Implement SGX specific hooks in SELinux

Tue Jun 11 22:55:42 UTC 2019

> From: linux-sgx-owner at vger.kernel.org [mailto:linux-sgx-
> owner at vger.kernel.org] On Behalf Of Stephen Smalley
> Sent: Tuesday, June 11, 2019 6:40 AM
> 
> >
> > +#ifdef CONFIG_INTEL_SGX
> > +	rc = sgxsec_mprotect(vma, prot);
> > +	if (rc <= 0)
> > +		return rc;
> 
> Why are you skipping the file_map_prot_check() call when rc == 0?
> What would SELinux check if you didn't do so -
> FILE__READ|FILE__WRITE|FILE__EXECUTE to /dev/sgx/enclave?  Is it a
> problem to let SELinux proceed with that check?

We can continue the check. But in practice, all of FILE__{READ|WRITE|EXECUTE} are needed for every enclave, so what's the point of checking them? FILE__EXECMOD may be the only flag that carries any meaning, but it's somewhat redundant because the sigstruct file was already checked against it.

> > +static int selinux_enclave_load(struct file *encl, unsigned long addr,
> > +				unsigned long size, unsigned long prot,
> > +				struct vm_area_struct *source)
> > +{
> > +	if (source) {
> > +		/**
> > +		 * Adding page from source => EADD request
> > +		 */
> > +		int rc = selinux_file_mprotect(source, prot, prot);
> > +		if (rc)
> > +			return rc;
> > +
> > +		if (!(prot & VM_EXEC) &&
> > +		    selinux_file_mprotect(source, VM_EXEC, VM_EXEC))
> 
> I wouldn't conflate VM_EXEC with PROT_EXEC even if they happen to be
> defined with the same values currently.  Elsewhere the kernel appears to
> explicitly translate them ala calc_vm_prot_bits().

Thanks! I'll change them to PROT_EXEC in the next version.
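
For reference, here is a minimal userspace sketch of the kind of translation calc_vm_prot_bits() performs. All SK_* names and values are made up for illustration (the two flag families deliberately use different values, so passing one where the other is expected would visibly break); this is not the kernel's implementation.

```c
#include <assert.h>

/* Hypothetical stand-ins for PROT_* (mmap/mprotect request bits). */
#define SK_PROT_READ	0x1UL
#define SK_PROT_WRITE	0x2UL
#define SK_PROT_EXEC	0x4UL

/* Hypothetical stand-ins for VM_* (VMA flag bits), deliberately distinct. */
#define SK_VM_READ	0x10UL
#define SK_VM_WRITE	0x20UL
#define SK_VM_EXEC	0x40UL

/* Map a single bit from one flag family to the other. */
unsigned long sk_trans(unsigned long x, unsigned long from, unsigned long to)
{
	return (x & from) ? to : 0;
}

/* Translate a PROT_*-style mask into a VM_*-style mask, bit by bit. */
unsigned long sk_calc_vm_prot_bits(unsigned long prot)
{
	/* The sketch only knows about the three bits defined above. */
	assert((prot & ~(SK_PROT_READ | SK_PROT_WRITE | SK_PROT_EXEC)) == 0);

	return sk_trans(prot, SK_PROT_READ, SK_VM_READ) |
	       sk_trans(prot, SK_PROT_WRITE, SK_VM_WRITE) |
	       sk_trans(prot, SK_PROT_EXEC, SK_VM_EXEC);
}
```

With an explicit translation like this, the hook would keep working even if the two sets of constants ever diverged.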

> 
> Also, this will mean that we will always perform an execute check on all
> sources, thereby triggering audit denial messages for any EADD sources
> that are only intended to be data.  Depending on the source, this could
> trigger PROCESS__EXECMEM or FILE__EXECMOD or FILE__EXECUTE.  In a world
> where users often just run any denials they see through audit2allow,
> they'll end up always allowing them all.  How can they tell whether it
> was needed? It would be preferable if we could only trigger execute
> checks when there is some probability that execute will be requested in
> the future.  Alternatives would be to silence the audit of these
> permission checks always via use of _noaudit() interfaces or to silence
> audit of these permissions via dontaudit rules in policy, but the latter
> would hide all denials of the permission by the process, not just those
> triggered from security_enclave_load().  And if we silence them, then we
> won't see them even if they were needed.

*_noaudit() is exactly what I wanted. But I couldn't find selinux_file_mprotect_noaudit()/file_has_perm_noaudit(), and I'm reluctant to duplicate code. Any suggestions?
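
To make the question concrete, one shape I could imagine is a shared helper that takes an "audit" flag, with thin audited and unaudited wrappers on top. The following is only a userspace sketch of that pattern; every sk_* name is hypothetical, the "audit log" is just a counter, and none of this exists in SELinux today.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical context: permissions granted by policy, plus a fake audit log. */
struct sk_ctx {
	unsigned int allowed;		/* bitmask of granted permissions */
	unsigned int denials_logged;	/* stands in for audit records */
};

/* Single place that holds the decision logic; auditing is optional. */
int sk_has_perm_common(struct sk_ctx *ctx, unsigned int requested, bool audit)
{
	assert(ctx);

	if ((ctx->allowed & requested) == requested)
		return 0;

	if (audit)
		ctx->denials_logged++;	/* only audited callers leave a record */

	return -EACCES;
}

/* Regular check: a denial is logged. */
int sk_has_perm(struct sk_ctx *ctx, unsigned int requested)
{
	return sk_has_perm_common(ctx, requested, true);
}

/* Speculative check: same decision, but nothing is logged. */
int sk_has_perm_noaudit(struct sk_ctx *ctx, unsigned int requested)
{
	return sk_has_perm_common(ctx, requested, false);
}
```

The speculative execute checks in security_enclave_load() would use the unaudited wrapper, so only the decision logic is shared and nothing is duplicated.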

> 
> > +			prot = 0;
> > +		else {
> > +			prot = SGX__EXECUTE;
> > +			if (source->vm_file &&
> > +			    !file_has_perm(current_cred(), source->vm_file,
> > +					   FILE__EXECMOD))
> > +				prot |= SGX__EXECMOD;
> 
> Similarly, this means that we will always perform a FILE__EXECMOD check
> on all executable sources, triggering audit denial messages for any EADD
> source that is executable but to which EXECMOD is not allowed, and again
> the most common pattern will be that users will add EXECMOD to all
> executable sources to avoid this.
> 
> > +		}
> > +		return sgxsec_eadd(encl, addr, size, prot);
> > +	} else {
> > +		/**
> > +		  * Adding page from NULL => EAUG request
> > +		  */
> > +		return sgxsec_eaug(encl, addr, size, prot);
> > +	}
> > +}
> > +
> > +static int selinux_enclave_init(struct file *encl,
> > +				const struct sgx_sigstruct *sigstruct,
> > +				struct vm_area_struct *vma)
> > +{
> > +	int rc = 0;
> > +
> > +	if (!vma)
> > +		rc = -EINVAL;
> 
> Is it ever valid to call this hook with a NULL vma?  If not, this should
> be handled/prevented by the caller.  If so, I'd just return -EINVAL
> immediately here.

vma shall never be NULL. I'll update it in the next version.

> 
> > +
> > +	if (!rc && !(vma->vm_flags & VM_EXEC))
> > +		rc = selinux_file_mprotect(vma, VM_EXEC, VM_EXEC);
> 
> I had thought we were trying to avoid overloading FILE__EXECUTE (or
> whatever gets checked here, e.g. could be PROCESS__EXECMEM or
> FILE__EXECMOD) on the sigstruct file, since the caller isn't truly
> executing code from it.

Agreed. Another problem with FILE__EXECMOD on the sigstruct file is that user code would then be allowed to modify SIGSTRUCT at will, which effectively wipes out the protection provided by FILE__EXECUTE.

> 
> I'd define new ENCLAVE__* permissions, including an up-front
> ENCLAVE__INIT permission that governs whether the sigstruct file can be
> used at all irrespective of memory protections.

Agreed.

> 
> Then you can also have ENCLAVE__EXECUTE, ENCLAVE__EXECMEM,
> ENCLAVE__EXECMOD for the execute-related checks.  Or you can use the
> /dev/sgx/enclave inode as the target for the execute checks and just
> reuse the file permissions there.

Now we've got 2 options - 1) New ENCLAVE__* flags on sigstruct file or 2) FILE__* on /dev/sgx/enclave. Which one do you think makes more sense?

ENCLAVE__EXECMEM seems to offer finer granularity (than PROCESS__EXECMEM) but I wonder if it'd have any real use in practice.

> > +int sgxsec_mprotect(struct vm_area_struct *vma, size_t prot) {
> > +	struct enclave_sec *esec;
> > +	int rc;
> > +
> > +	if (!vma->vm_file || !(esec = __esec(selinux_file(vma->vm_file)))) {
> > +		/* Positive return value indicates non-enclave VMA */
> > +		return 1;
> > +	}
> > +
> > +	down_read(&esec->sem);
> > +	rc = enclave_mprotect(&esec->regions, vma->vm_start, vma->vm_end,
> > +			      prot);
> 
> Why is it safe for this to only use down_read()? enclave_mprotect() can
> call enclave_prot_set_cb() which modifies the list?

Probably because it was too late at night when I wrote this line:-( Good catch!

> 
> I haven't looked at this code closely, but it feels like a lot of SGX-
> specific logic embedded into SELinux that will have to be repeated or
> reused for every security module.  Does SGX not track this state itself?

I can tell you have looked quite closely, and I truly thank you for your time!

You are right that there is SGX-specific logic. More precisely, SGX enclaves don't have access to anything except memory, so there are only 3 questions that need to be answered for each enclave page: 1) whether X is allowed; 2) whether W->X is allowed; and 3) whether WX is allowed. This proposal tries to cache the answers to those questions upon creation of each enclave page, meaning it involves a) figuring out the answers and b) "remembering" them for every page. #b is generic, mostly captured in intel_sgx.c, and could be shared among all LSMs, while #a is SELinux specific. I could move intel_sgx.c up one level in the directory hierarchy if that's what you'd suggest.
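
To illustrate the idea (and only the idea), here is a small userspace sketch of caching the three answers per page and consulting them at mprotect() time. The sk_* names, the bit layout, and the "was_writable" tracking are all assumptions made up for this example; it is not the code in intel_sgx.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical protection request bits. */
#define SK_PROT_READ	0x1U
#define SK_PROT_WRITE	0x2U
#define SK_PROT_EXEC	0x4U

/* Hypothetical cached answers, decided once when the page is created. */
#define SK_ALLOW_X	0x1U	/* 1) may the page ever be executable?   */
#define SK_ALLOW_W2X	0x2U	/* 2) may a once-writable page become X? */
#define SK_ALLOW_WX	0x4U	/* 3) may the page be W and X at once?   */

struct sk_page {
	unsigned int allowed;	/* SK_ALLOW_* bits cached at creation */
	bool was_writable;	/* has write access ever been granted? */
};

/* Decide an mprotect()-style request using only the cached answers. */
int sk_page_mprotect(struct sk_page *pg, unsigned int prot)
{
	assert(pg);

	if (prot & SK_PROT_EXEC) {
		if (!(pg->allowed & SK_ALLOW_X))
			return -EACCES;
		if ((prot & SK_PROT_WRITE) && !(pg->allowed & SK_ALLOW_WX))
			return -EACCES;
		if (pg->was_writable && !(pg->allowed & SK_ALLOW_W2X))
			return -EACCES;
	}

	/* Remember that the page has been writable, for later W->X checks. */
	if (prot & SK_PROT_WRITE)
		pg->was_writable = true;

	return 0;
}
```

In this sketch, filling in "allowed" corresponds to #a (the policy-specific part), while the structure and the lookup correspond to #b (the part that could be shared).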

By "SGX", did you mean the SGX subsystem being upstreamed? It doesn't track that state. In practice, there's no way for SGX to track it because there's no vm_ops->may_mprotect() callback. Tracking it there wouldn't follow the philosophy of Linux either, as mprotect() doesn't track it for regular memory. And the state has no use without an LSM, so I believe it makes more sense to track it inside the LSM.