[RFC PATCH v1 0/2] Add capabilities file to sysfs

Francis Laniel flaniel at linux.microsoft.com
Wed Dec 29 20:56:50 UTC 2021


Le mercredi 29 décembre 2021, 02:26:20 CET Casey Schaufler a écrit :
> On 12/28/2021 5:34 AM, Francis Laniel wrote:
> > Hi.
> > 
> > Le lundi 27 décembre 2021, 23:22:41 CET Casey Schaufler a écrit :
> >> On 12/27/2021 12:54 PM, Francis Laniel wrote:
> >>> Hi.
> >>> 
> >>> 
> >>> First, I hope you are fine and the same for your relatives.
> >>> 
> >>> Capabilities are used to check if a thread has the right to perform a
> >>> given
> >>> action [1].
> >>> For example, a thread with CAP_BPF set can use the bpf() syscall.
> >>> 
> >>> Capabilities are used in the container world.
> >>> In terms of code, several projects related to container maintain code
> >>> where the capabilities are written alike include/uapi/linux/capability.h
> >>> [2][3][4][5]. For these projects, their codebase should be updated when
> >>> a
> >>> new capability is added to the kernel.
> >>> Some other projects rely on <sys/capability.h> [6].
> >>> In this case, this header file should reflect the capabilities offered
> >>> by
> >>> the kernel.
> >>> 
> >>> So, in this series, I added a new file to sysfs:
> >>> /sys/kernel/capabilities.
> >> 
> >> This should be /sys/kernel/security/capabilities.
> > 
> > I began to write code to move this under /sys/kernel/security/capabilities
> > but I realized this directory is linked to CONFIG_SECURITYFS.
> > This option is not required to be able to run container [1].
> 
> You're going to need to handle the case where the file is missing
> regardless. It is hard to design a kernel feature based on what a
> container expects when there are so many definitions of a container.

The goal would be to always have this file, as if I need to handle the case 
where it is missing, it surely means having hardcoded values for capabilities 
like today situation.
I think it should be always here if its definition lies in a source file marked 
as 'obj-y', right?

Nonetheless, I understand your point of having this "capabilities printing" 
file under /sys/kernel/security.
But, this would lead to add CONFIG_SECURITYFS as a needed CONFIG_ to "run 
container".
And, if "container stack" runs on a kernel which does not provide this option, 
then "container software" would need to rely on hardcoded capabilities (like 
today situation).

> > Also, kernel/capability.c is always compiled, so I think it is better if
> > this file (i.e. the one which prints capabilities to user) does not
> > depend on any CONFIG_.
> 
> CONFIG_MULTUSER is going to be an issue if you really care.

I did not know about CONFIG_MULTIUSER and thinking a bit about at it, it 
should be a needed CONFIG_ to run container.
Nonetheless, when CONFIG_MULTIUSER=n and if my understanding of 2813893f8b197 
is correct, calling capset() leads to sys_ni_syscall() which returns -ENOSYS.
Thus, an user trying to "run container" with specific capabilities on a kernel 
compiled with CONFIG_MULTIUSER=n will have trouble with all capabilities (and 
will then have to first fix its CONFIG_ issue instead of knowing which 
capabilities he/she can use).

> > What do you think of it? Does this sound acceptable for you?
> 
> Meh. I'm not going to get worked up over it, but your rationale
> is a little weak.

I agree this contribution will not revolutionize how user interact with the 
kernel.
But I see two advantages:
1. By removing hardcoded capabilities values from "container software", it 
will ease these software maintainability.
Indeed, someone will not need to add new values to their code base when the 
kernel get a new capability.
2. On the user side, without hardcoded capabilities values, you can use any 
version of "container software" on a recent kernel to be able to use all the 
capabilities the kernel offers.
For the anecdote, it was my case some time ago and I had to compile newer 
version of "container software" to use underlying kernel capabilities [1].
This was not a big deal, but if this "container software" were capabilities 
agnostic, it would have gain me a bit of time.
From this experience, I had the idea of this contribution.

> >>> The goal of this file is to be used by "container world" software to
> >>> know
> >>> kernel capabilities at run time instead of compile time.
> >>> 
> >>> The underlying kernel attribute is read-only and its content is the
> >>> capability number associated with the capability name:
> >>> root at vm-amd64:~# cat /sys/kernel/capabilities
> >>> 0       CAP_CHOWN
> >>> 1       CAP_DAC_OVERRIDE
> >>> ...
> >>> 39      CAP_BPF
> >>> 
> >>> The kernel already exposes the last capability number under:
> >>> /proc/sys/kernel/cap_last_cap
> >>> So, I think there should not be any issue exposing all the capabilities
> >>> it
> >>> offers.
> >>> If there is any, please share it as I do not want to introduce issue
> >>> with
> >>> this series.
> >>> 
> >>> Also, if you see any way to improve this series please share it as it
> >>> would
> >>> increase this contribution quality.
> >>> 
> >>> Francis Laniel (2):
> >>>     capability: Add cap_strings.
> >>>     kernel/ksysfs.c: Add capabilities attribute.
> >>>    
> >>>    include/uapi/linux/capability.h |  1 +
> >>>    kernel/capability.c             | 45
> >>>    +++++++++++++++++++++++++++++++++
> >>>    kernel/ksysfs.c                 | 18 +++++++++++++
> >>>    3 files changed, 64 insertions(+)
> >>> 
> >>> Best regards and thank you in advance for your reviews.
> >>> ---
> >>> [1] man capabilities
> >>> [2]
> >>> https://github.com/containerd/containerd/blob/1a078e6893d07fec10a4940a56
> >>> 6
> >>> 4fab21d6f7d1e/pkg/cap/cap_linux.go#L135 [3]
> >>> https://github.com/moby/moby/commit/485cf38d48e7111b3d1f584d5e9eab46a902
> >>> a
> >>> abc#diff-2e04625b209932e74c617de96682ed72fbd1bb0d0cb9fb7c709cf47a86b6f9c
> >>> 1
> >>> moby relies on containerd code.
> >>> [4]
> >>> https://github.com/syndtr/gocapability/blob/42c35b4376354fd554efc7ad35e0
> >>> b
> >>> 7f94e3a0ffb/capability/enum.go#L47 [5]
> >>> https://github.com/opencontainers/runc/blob/00f56786bb220b55b41748231880
> >>> b
> >>> a0e6380519a/libcontainer/capabilities/capabilities.go#L12 runc relies on
> >>> syndtr package.
> >>> [6]
> >>> https://github.com/containers/crun/blob/fafb556f09e6ffd4690c452ff51856b8
> >>> 8
> >>> 0c089f1/src/libcrun/linux.c#L35
> > 
> > Best regards.
> > ---
> > [1] https://github.com/moby/moby/blob/
> > 10aecb0e652d346130a37e5b4383eca28f594c21/contrib/check-config.sh
---
[1] https://github.com/kinvolk/minikube/commit/
51bf81c816c004ca0d0f3e3e368bc59f8a208387





More information about the Linux-security-module-archive mailing list