[PATCH v1 0/3] Introduce CAP_SYS_PERFMON capability for secure Perf users groups

Stephane Eranian eranian at google.com
Wed Dec 11 19:04:59 UTC 2019


On Thu, Dec 5, 2019 at 9:35 AM Casey Schaufler <casey at schaufler-ca.com> wrote:
>
> On 12/5/2019 9:05 AM, Alexey Budankov wrote:
> > Hello Casey,
> >
> > On 05.12.2019 19:49, Casey Schaufler wrote:
> >> On 12/5/2019 8:15 AM, Alexey Budankov wrote:
> >>> Currently access to perf_events functionality [1] beyond the scope permitted
> >>> by perf_event_paranoid [1] kernel setting is allowed to a privileged process
> >>> [2] with CAP_SYS_ADMIN capability enabled in the process effective set [3].
> >>>
> >>> This patch set introduces CAP_SYS_PERFMON capability devoted to secure performance
> >>> monitoring activity so that CAP_SYS_PERFMON would assist CAP_SYS_ADMIN in its
> >>> governing role for perf_events based performance monitoring of a system.
> >>>
> >>> CAP_SYS_PERFMON aims to harden system security and integrity when monitoring
> >>> performance using perf_events subsystem by processes and Perf privileged users
> >>> [2], thus decreasing attack surface that is available to CAP_SYS_ADMIN
> >>> privileged processes [3].
> >> Are there use cases where you would need CAP_SYS_PERFMON where you
> >> would not also need CAP_SYS_ADMIN? If you separate a new capability
> > Actually, there are. Perf tool that has record, stat and top modes could run with
> > CAP_SYS_PERFMON capability as mentioned below and provide system wide performance
> > data. Currently for that to work the tool needs to be granted with CAP_SYS_ADMIN.
>
> The question isn't whether the tool could use the capability, it's whether
> the tool would also need CAP_SYS_ADMIN to be useful. Are there existing
> tools that could stop using CAP_SYS_ADMIN in favor of CAP_SYS_PERFMON?

The answer is yes. I have recently been alerted to a problem with paranoid=2
and the popular rr debugger (https://rr-project.org/). This debugger uses
several perf_events features, including profiling of PMU events and
tracepoints (context switches). With paranoid=2, it no longer works. We would
need a privilege level between regular user and admin to make it work again.
Note that the context-switch tracepoint is applied only to the process itself
(not system-wide).
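
For reference, here is a rough model of how the perf_event_paranoid levels
gate access, and of where a capability between regular user and admin would
fit. It is a simplified sketch in Python, not kernel code: the function name
perf_allowed and the event kinds are made up for illustration, the thresholds
follow the levels documented in perf_event_open(2), and I am assuming, from
the title of patch 2/3, that CAP_SYS_PERFMON would apply to CPU and kernel
monitoring but not to raw tracepoints. It does not try to model what rr
itself requests.

```python
# Simplified model of the perf_event_paranoid gate (not kernel code).
# Highest paranoid level at which each kind of access is still open to an
# unprivileged user, per the levels documented in perf_event_open(2):
#   >= 0 blocks raw tracepoint access, >= 1 blocks CPU event access,
#   >= 2 blocks kernel profiling.
MAX_UNPRIVILEGED_LEVEL = {"tracepoint": -1, "cpu": 0, "kernel": 1}


def perf_allowed(kind, paranoid, caps):
    """Return True if access of the given kind would be permitted.

    kind     -- "tracepoint", "cpu" or "kernel" (hypothetical labels)
    paranoid -- value of /proc/sys/kernel/perf_event_paranoid
    caps     -- set of capability names in the process effective set
    """
    privileged = "cap_sys_admin" in caps
    # Assumption: the proposed capability covers only CPU and kernel
    # monitoring, as the title of patch 2/3 suggests.
    if kind in ("cpu", "kernel"):
        privileged = privileged or "cap_sys_perfmon" in caps
    return paranoid <= MAX_UNPRIVILEGED_LEVEL[kind] or privileged


if __name__ == "__main__":
    for caps in (set(), {"cap_sys_perfmon"}, {"cap_sys_admin"}):
        label = ",".join(sorted(caps)) or "no capabilities"
        allowed = [k for k in MAX_UNPRIVILEGED_LEVEL if perf_allowed(k, 2, caps)]
        print(f"paranoid=2, {label}: {allowed}")
```

Under these assumptions, a regular user at paranoid=2 gets none of the three
kinds of access, while a process holding only the new capability regains CPU
and kernel monitoring without being granted CAP_SYS_ADMIN.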


> My bet is that any tool that does performance monitoring is going to need
> CAP_SYS_ADMIN for other reasons.
>
> >
> >> from CAP_SYS_ADMIN but always have to use CAP_SYS_ADMIN in conjunction
> >> with the new capability it is all rather pointless.
> >>
> >> The scope you've defined for this CAP_SYS_PERFMON is very small.
> >> Is there a larger set of privilege checks that might be applicable
> >> for it?
> > > CAP_SYS_PERFMON could be applied more broadly, but this patch set enables the
> > > record and stat mode use cases for system wide performance monitoring in
> > > kernel and user modes.
>
> The granularity of capabilities is something we have to watch
> very carefully. Sure, CAP_SYS_ADMIN covers a lot of things, but
> if we broke it up "properly" we'd have hundreds of capabilities.
> If you want control that finely we have SELinux.
>
> >
> > Thanks,
> > Alexey
> >
> >>
> >>
> >>> CAP_SYS_PERFMON aims to take over CAP_SYS_ADMIN credentials related to
> >>> performance monitoring functionality of perf_events and balance amount of
> >>> CAP_SYS_ADMIN credentials in accordance with the recommendations provided in
> >>> the man page for CAP_SYS_ADMIN [3]: "Note: this capability is overloaded;
> >>> see Notes to kernel developers, below."
> >>>
> >>> For backward compatibility reasons performance monitoring functionality of
> >>> perf_events subsystem remains available under CAP_SYS_ADMIN but its usage for
> >>> secure performance monitoring use cases is discouraged with respect to the
> >>> introduced CAP_SYS_PERFMON capability.
> >>>
> >>> In the suggested implementation CAP_SYS_PERFMON enables Perf privileged users
> >>> [2] to conduct secure performance monitoring using perf_events in the scope
> >>> of available online CPUs when executing code in kernel and user modes.
> >>>
> >>> A possible alternative solution to this capability balancing and security
> >>> hardening task would be to use the existing CAP_SYS_PTRACE capability to govern
> >>> perf_events' performance monitoring functionality, since process debugging is
> >>> similar to performance monitoring with respect to providing insights into
> >>> process memory and execution details. However CAP_SYS_PTRACE still provides
> >>> users with more credentials than are required for secure performance monitoring
> >>> using perf_events subsystem and this excess is avoided by using the dedicated
> >>> CAP_SYS_PERFMON capability.
> >>>
> >>> libcap library utilities [4], [5] and Perf tool can be used to apply
> >>> CAP_SYS_PERFMON capability for secure performance monitoring beyond the scope
> >>> permitted by system wide perf_event_paranoid kernel setting and below are the
> >>> steps to evaluate the advancement suggested by the patch set:
> >>>
> >>>   - patch, build and boot the kernel
> >>>   - patch, build Perf tool e.g. to /home/user/perf
> >>>   ...
> >>>   # git clone git://git.kernel.org/pub/scm/libs/libcap/libcap.git libcap
> >>>   # pushd libcap
> >>>   # patch libcap/include/uapi/linux/capability.h with [PATCH 1/3]
> >>>   # make
> >>>   # pushd progs
> >>>   # ./setcap "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
> >>>   # ./setcap -v "cap_sys_perfmon,cap_sys_ptrace,cap_syslog=ep" /home/user/perf
> >>>   /home/user/perf: OK
> >>>   # ./getcap /home/user/perf
> >>>   /home/user/perf = cap_sys_ptrace,cap_syslog,cap_sys_perfmon+ep
> >>>   # echo 2 > /proc/sys/kernel/perf_event_paranoid
> >>>   # cat /proc/sys/kernel/perf_event_paranoid
> >>>   2
> >>>   ...
> >>>   $ /home/user/perf top
> >>>     ... works as expected ...
> >>>   $ cat /proc/`pidof perf`/status
> >>>   Name:     perf
> >>>   Umask:    0002
> >>>   State:    S (sleeping)
> >>>   Tgid:     2958
> >>>   Ngid:     0
> >>>   Pid:      2958
> >>>   PPid:     9847
> >>>   TracerPid:        0
> >>>   Uid:      500     500     500     500
> >>>   Gid:      500     500     500     500
> >>>   FDSize:   256
> >>>   ...
> >>>   CapInh:   0000000000000000
> >>>   CapPrm:   0000004400080000
> >>>   CapEff:   0000004400080000 => 01000100 00000000 00001000 00000000 00000000
> >>>                                      cap_sys_perfmon,cap_sys_ptrace,cap_syslog
> >>>   CapBnd:   0000007fffffffff
> >>>   CapAmb:   0000000000000000
> >>>   NoNewPrivs:       0
> >>>   Seccomp:  0
> >>>   Speculation_Store_Bypass: thread vulnerable
> >>>   Cpus_allowed:     ff
> >>>   Cpus_allowed_list:        0-7
> >>>   ...
> >>>
> >>> Using cap_sys_perfmon avoids granting an excess of unused credentials:
> >>> - with cap_sys_admin:
> >>>   CapEff:   0000007fffffffff => 01111111 11111111 11111111 11111111 11111111
> >>> - with cap_sys_perfmon:
> >>>   CapEff:   0000004400080000 => 01000100 00000000 00001000 00000000 00000000
> >>>                                     38   34               19
> >>>                            sys_perfmon   syslog           sys_ptrace
> >>>
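
The CapEff masks quoted here can be checked by hand, but a few lines of Python
make the decoding explicit. This is a minimal sketch: decode_caps and
encode_caps are names I made up, the table lists only the capabilities
mentioned in this thread, and bit 38 for cap_sys_perfmon is the value shown in
the quoted output of this patch series, so treat it as an assumption.

```python
# Bit numbers for the capabilities discussed in this thread. 19, 21 and 34
# are the values in include/uapi/linux/capability.h; 38 is the bit shown for
# the proposed cap_sys_perfmon in the quoted output (an assumption here).
CAP_BITS = {
    19: "cap_sys_ptrace",
    21: "cap_sys_admin",
    34: "cap_syslog",
    38: "cap_sys_perfmon",
}


def decode_caps(mask_hex):
    """Turn a CapEff/CapPrm hex string from /proc/<pid>/status into names."""
    mask = int(mask_hex, 16)
    names = []
    for bit in range(mask.bit_length()):
        if mask & (1 << bit):
            # Bits missing from the table are reported by number.
            names.append(CAP_BITS.get(bit, f"cap_{bit}"))
    return names


def encode_caps(bits):
    """Build the 16-digit hex mask for an iterable of capability bit numbers."""
    mask = 0
    for bit in bits:
        mask |= 1 << bit
    return f"{mask:016x}"


if __name__ == "__main__":
    print(decode_caps("0000004400080000"))
    print(encode_caps([19, 34, 38]))
    print(len(decode_caps("0000007fffffffff")), "capabilities in the full mask")
```

Decoding 0000004400080000 yields cap_sys_ptrace, cap_syslog and
cap_sys_perfmon, the same three capabilities getcap reports, while the full
mask 0000007fffffffff has 39 bits set.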
> >>> The patch set is for tip perf/core repository:
> >>>   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip perf/core
> >>>   tip sha1: ceb9e77324fa661b1001a0ae66f061b5fcb4e4e6
> >>>
> >>> [1] http://man7.org/linux/man-pages/man2/perf_event_open.2.html
> >>> [2] https://www.kernel.org/doc/html/latest/admin-guide/perf-security.html
> >>> [3] http://man7.org/linux/man-pages/man7/capabilities.7.html
> >>> [4] http://man7.org/linux/man-pages/man8/setcap.8.html
> >>> [5] https://git.kernel.org/pub/scm/libs/libcap/libcap.git
> >>> [6] https://sites.google.com/site/fullycapable/, posix_1003.1e-990310.pdf
> >>>
> >>> ---
> >>> Alexey Budankov (3):
> >>>   capabilities: introduce CAP_SYS_PERFMON to kernel and user space
> >>>   perf/core: apply CAP_SYS_PERFMON to CPUs and kernel monitoring
> >>>   perf tool: extend Perf tool with CAP_SYS_PERFMON support
> >>>
> >>>  include/linux/perf_event.h          |  6 ++++--
> >>>  include/uapi/linux/capability.h     | 10 +++++++++-
> >>>  security/selinux/include/classmap.h |  4 ++--
> >>>  tools/perf/design.txt               |  3 ++-
> >>>  tools/perf/util/cap.h               |  4 ++++
> >>>  tools/perf/util/evsel.c             | 10 +++++-----
> >>>  tools/perf/util/util.c              | 15 +++++++++++++--
> >>>  7 files changed, 39 insertions(+), 13 deletions(-)
> >>>
> >>
>


