[PATCH v4 5/5] doc: Add documentation for the fs.open_mayexec_enforce sysctl

Mickaël Salaün mic at digikod.net
Thu Apr 30 13:23:20 UTC 2020

This sysctl makes it possible to propagate executable permission to userspace
thanks to the O_MAYEXEC flag.

Signed-off-by: Mickaël Salaün <mic at digikod.net>
Reviewed-by: Thibaut Sautereau <thibaut.sautereau at ssi.gouv.fr>
Cc: Aleksa Sarai <cyphar at cyphar.com>
Cc: Al Viro <viro at zeniv.linux.org.uk>
Cc: Jonathan Corbet <corbet at lwn.net>
Cc: Kees Cook <keescook at chromium.org>

Changes since v3:
* Switch back to O_MAYEXEC and highlight that it is only taken into
  account by openat2(2).

Changes since v2:
* Update documentation with the new RESOLVE_MAYEXEC.
* Improve explanations, including concerns about LD_PRELOAD.

Changes since v1:
* Move from LSM/Yama to sysctl/fs.
 Documentation/admin-guide/sysctl/fs.rst | 44 +++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

diff --git a/Documentation/admin-guide/sysctl/fs.rst b/Documentation/admin-guide/sysctl/fs.rst
index 2a45119e3331..d55615c36772 100644
--- a/Documentation/admin-guide/sysctl/fs.rst
+++ b/Documentation/admin-guide/sysctl/fs.rst
@@ -37,6 +37,7 @@ Currently, these files are in /proc/sys/fs:
 - inode-nr
 - inode-state
 - nr_open
+- open_mayexec_enforce
 - overflowuid
 - overflowgid
 - pipe-user-pages-hard
@@ -165,6 +166,49 @@ system needs to prune the inode list instead of allocating
+While being ignored by :manpage:`open(2)` and :manpage:`openat(2)`, the
+``O_MAYEXEC`` flag can be passed to :manpage:`openat2(2)` to only open regular
+files that are expected to be executable.  If the file is not identified as
+executable, then the syscall returns -EACCES.  This may allow a script
+interpreter to check executable permission before reading commands from a file,
+or a dynamic linker to only load executable shared objects.  One interesting
+use case is to enforce a "write xor execute" policy through interpreters.
+
+The ability to restrict code execution must be thought of as a system-wide
+policy, which starts by restricting mount points with the ``noexec`` option.
+This option is also automatically applied to special filesystems such as
+/proc.  This prevents files on such mount points from being directly executed
+by the kernel or mapped as executable memory (e.g. libraries).  With script
+interpreters using the ``O_MAYEXEC`` flag, the executable permission can then
+be checked before reading commands from files.  This makes it possible to
+enforce ``noexec`` at the interpreter level, and thus to propagate this
+security policy to scripts.  To be fully effective, these interpreters also
+need to handle the other ways to execute code: command line parameters (e.g.,
+option ``-e`` for Perl), module loading (e.g., option ``-m`` for Python),
+stdin, file sourcing, environment variables, configuration files, etc.
+
+Depending on the threat model, it may be acceptable to allow some script
+interpreters (e.g. Bash) to interpret commands from stdin, whether it is a TTY
+or a pipe, because this may not be enough to (directly) perform syscalls.
+
+There are two complementary security policies: enforce the ``noexec`` mount
+option, and enforce executable file permission.  These policies are handled by
+the ``fs.open_mayexec_enforce`` sysctl (writable only with ``CAP_MAC_ADMIN``)
+as a bitmask:
+
+1 - Mount restriction: checks that the mount options for the underlying VFS
+    mount do not prevent execution.
+
+2 - File permission restriction: checks that the to-be-opened file is marked as
+    executable for the current process (e.g., POSIX permissions).
+
+Code samples can be found in tools/testing/selftests/openat2/omayexec_test.c
+and at
+https://github.com/clipos-archive/clipos4_portage-overlay/search?q=O_MAYEXEC .
 overflowgid & overflowuid

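The bitmask encoding of fs.open_mayexec_enforce can be illustrated with a
small decoder.  This is a sketch under assumptions: struct mayexec_policy and
parse_mayexec_enforce are hypothetical names, and values above 3 are rejected
on the assumption that only the two bits documented in the patch are defined.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical decoded view of the sysctl value. */
struct mayexec_policy {
	bool mount;	/* bit 1: mount restriction (noexec) */
	bool file;	/* bit 2: file permission restriction */
};

/*
 * Parse the textual sysctl value (e.g. "3\n") into its two policy bits.
 * Returns 0 on success, -1 if text is not a decimal number in 0..3.
 */
int parse_mayexec_enforce(const char *text, struct mayexec_policy *out)
{
	char *end;
	long value;

	errno = 0;
	value = strtol(text, &end, 10);
	if (end == text || errno != 0)
		return -1;
	if (*end == '\n')	/* sysctl files end with a newline */
		end++;
	if (*end != '\0' || value < 0 || value > 3)
		return -1;
	out->mount = (value & 1) != 0;
	out->file = (value & 2) != 0;
	return 0;
}
```

With this decoder, "1" selects the mount restriction only, "2" the file
permission restriction only, and "3" both.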