[PATCH RFC] perf_event: Add support for LSM and SELinux checks

Thu Oct 10 15:13:33 UTC 2019

On Thu, Oct 10, 2019 at 10:12:51AM +0200, Peter Zijlstra wrote:
> On Wed, Oct 09, 2019 at 04:36:57PM -0400, Joel Fernandes (Google) wrote:
> > In currentl mainline, the degree of access to perf_event_open(2) system
> > call depends on the perf_event_paranoid sysctl.  This has a number of
> > limitations:
> > 
> > 1. The sysctl is only a single value. Many types of accesses are controlled
> >    based on the single value thus making the control very limited and
> >    coarse grained.
> > 2. The sysctl is global, so if the sysctl is changed, then that means
> >    all processes get access to perf_event_open(2) opening the door to
> >    security issues.
> > 
> > This patch adds LSM and SELinux access checking which will be used in
> > Android to access perf_event_open(2) for the purposes of attaching BPF
> > programs to tracepoints, perf profiling and other operations from
> > userspace. These operations are intended for production systems.
> > 
> > 5 new LSM hooks are added:
> > 1. perf_event_open: This controls access during the perf_event_open(2)
> >    syscall itself. The hook is called from all the places that the
> >    perf_event_paranoid sysctl is checked to keep it consistent with the
> >    systctl. The hook gets passed a 'type' argument which controls CPU,
> >    kernel and tracepoint accesses (in this context, CPU, kernel and
> >    tracepoint have the same semantics as the perf_event_paranoid sysctl).
> >    Additionally, I added an 'open' type which is similar to
> >    perf_event_paranoid sysctl == 3 patch carried in Android and several other
> >    distros but was rejected in mainline [1] in 2016.
> > 
> > 2. perf_event_alloc: This allocates a new security object for the event
> >    which stores the current SID within the event. It will be useful when
> >    the perf event's FD is passed through IPC to another process which may
> >    try to read the FD. Appropriate security checks will limit access.
> > 
> > 3. perf_event_free: Called when the event is closed.
> > 
> > 4. perf_event_read: Called from the read(2) system call path for the event.
> 
> 	+ mmap()
> > 
> > 5. perf_event_write: Called from the read(2) system call path for the event.
> 
> 	- read() + ioctl()

Fixed.

> 
> fresh from the keyboard.. but maybe consoldate things a little.

Looks great to me, I folded it into the patch. Thanks Peter! Just one comment
on change in existing logic of the code, below:

[snip]
> --- a/arch/x86/events/intel/p4.c
> +++ b/arch/x86/events/intel/p4.c
> @@ -8,7 +8,6 @@
>   */
>  
>  #include <linux/perf_event.h>
> -#include <linux/security.h>
>  
>  #include <asm/perf_event_p4.h>
>  #include <asm/hardirq.h>
> @@ -777,10 +776,7 @@ static int p4_validate_raw_event(struct
>  	 * the user needs special permissions to be able to use it
>  	 */
>  	if (p4_ht_active() && p4_event_bind_map[v].shared) {
> -		if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
> -			return -EACCES;
> -
> -		v = security_perf_event_open(&event->attr, PERF_SECURITY_CPU);
> +		v = perf_allow_cpu(&event->attr);
>  		if (v)
>  			return v;
>  	}
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -56,6 +56,7 @@ struct perf_guest_info_callbacks {
>  #include <linux/perf_regs.h>
>  #include <linux/cgroup.h>
>  #include <linux/refcount.h>
> +#include <linux/security.h>
>  #include <asm/local.h>
>  
>  struct perf_callchain_entry {
> @@ -1244,19 +1245,28 @@ extern int perf_cpu_time_max_percent_han
>  int perf_event_max_stack_handler(struct ctl_table *table, int write,
>  				 void __user *buffer, size_t *lenp, loff_t *ppos);
>  
> -static inline bool perf_paranoid_tracepoint_raw(void)
> +static inline int perf_allow_kernel(struct perf_event_attr *attr)
>  {
> -	return sysctl_perf_event_paranoid > -1;
> +	if (sysctl_perf_event_paranoid > 1 && !capable(CAP_SYS_ADMIN))
> +		return -EACCES;
> +
> +	return security_perf_event_open(attr, PERF_SECURITY_KERNEL);
>  }
>  
> -static inline bool perf_paranoid_cpu(void)
> +static inline int perf_allow_cpu(struct perf_event_attr *attr)
>  {
> -	return sysctl_perf_event_paranoid > 0;
> +	if (sysctl_perf_event_paranoid > 0 && !capable(CAP_SYS_ADMIN))
> +		return -EACCES;
> +
> +	return security_perf_event_open(attr, PERF_SECURITY_CPU);
>  }
>  
> -static inline bool perf_paranoid_kernel(void)
> +static inline int perf_allow_tracepoint(struct perf_event_attr *attr)
>  {
> -	return sysctl_perf_event_paranoid > 1;
> +	if (sysctl_perf_event_paranoid > -1 && !capable(CAP_SYS_ADMIN))
> +		return -EPERM;
> +

Here the sysctl check of > -1 also is now coupled with a CAP_SYS_ADMIN check.
However..

> +	return security_perf_event_open(attr, PERF_SECURITY_TRACEPOINT);

>  }
>  
>  extern void perf_event_init(void);
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -4229,10 +4229,7 @@ find_get_context(struct pmu *pmu, struct
>  
>  	if (!task) {
>  		/* Must be root to operate on a CPU event: */
> -		if (perf_paranoid_cpu() && !capable(CAP_SYS_ADMIN))
> -			return ERR_PTR(-EACCES);
> -
> -		err = security_perf_event_open(&event->attr, PERF_SECURITY_CPU);
> +		err = perf_allow_cpu(&event->attr);
>  		if (err)
>  			return ERR_PTR(err);
>  
> @@ -5862,14 +5859,8 @@ static int perf_mmap(struct file *file,
>  	lock_limit >>= PAGE_SHIFT;
>  	locked = atomic64_read(&vma->vm_mm->pinned_vm) + extra;
>  
> -	if (locked > lock_limit) {
> -		if (perf_paranoid_tracepoint_raw() && !capable(CAP_IPC_LOCK)) {
> -			ret = -EPERM;
> -			goto unlock;
> -		}
> -
> -		ret = security_perf_event_open(&event->attr,
> -					       PERF_SECURITY_TRACEPOINT);
> +	if (locked > lock_limit && !capable(CAP_IPC_LOCK)) {
> +		ret = perf_allow_tracepoint(&event->attr);

In previous code, this check did not involve a check for CAP_SYS_ADMIN.

I am Ok with adding the CAP_SYS_ADMIN check as well which does make sense to
me for tracepoint access. But it is still a change in the logic so I wanted
to bring it up.

Let me know any other thoughts and then I'll post a new patch.

thanks,

- Joel

[snip]