[PATCH RFC v3 00/10] coredump: add coredump socket
Christian Brauner
brauner at kernel.org
Mon May 5 14:56:04 UTC 2025
On Mon, May 05, 2025 at 04:41:28PM +0200, Mickaël Salaün wrote:
> On Mon, May 05, 2025 at 01:13:38PM +0200, Christian Brauner wrote:
> > Coredumping currently supports two modes:
> >
> > (1) Dumping directly into a file somewhere on the filesystem.
> > (2) Dumping into a pipe connected to a usermode helper process
> > spawned as a child of the system_unbound_wq or kthreadd.
> >
> > For simplicity I'm mostly ignoring (1). There are probably still some
> > users of (1) out there, but processing coredumps in this way can be
> > considered adventurous, especially in the face of set*id binaries.
> >
> > The most common option should be (2) by now. It works by allowing
> > userspace to put a string into /proc/sys/kernel/core_pattern like:
> >
> > |/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
> >
> > The "|" at the beginning indicates to the kernel that a pipe must be
> > used. The path following the pipe indicator is a path to a binary that
> > will be spawned as a usermode helper process. Any additional parameters
> > pass information about the task that is generating the coredump to the
> > binary that processes the coredump.
> >
> > In the example core_pattern shown above systemd-coredump is spawned as a
> > usermode helper. There are various conceptual consequences of this
> > (non-exhaustive list):
> >
> > - systemd-coredump is spawned with file descriptor number 0 (stdin)
> > connected to the read-end of the pipe. All other file descriptors are
> > closed. That specifically includes 1 (stdout) and 2 (stderr). This has
> > already caused bugs because userspace assumed that this cannot happen
> > (whether or not this is a sane assumption is irrelevant).
> >
> > - systemd-coredump will be spawned as a child of system_unbound_wq. So
> > it is not a child of any userspace process and specifically not a
> > child of PID 1. It cannot be waited upon and is a weird hybrid
> > upcall, which is difficult for userspace to control correctly.
> >
> > - systemd-coredump is spawned with full kernel privileges. This
> > necessitates all kinds of weird privilege dropping exercises in
> > userspace to make this safe.
> >
> > - A new usermode helper has to be spawned for each crashing process.
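The first consequence above (only fd 0 is open when the helper starts) is usually handled defensively at startup. A minimal sketch of that idiom, not taken from systemd-coredump or the series; the function name ensure_std_fds is hypothetical:

```python
import os


def ensure_std_fds() -> None:
    """Make sure fds 0, 1 and 2 are all open.

    os.open() returns the lowest free descriptor, so opening /dev/null
    repeatedly fills any hole among 0..2. Once a descriptor above 2
    comes back, all three standard slots are occupied.
    """
    while True:
        fd = os.open(os.devnull, os.O_RDWR)
        if fd > 2:
            # 0, 1 and 2 were already taken; drop the extra descriptor.
            os.close(fd)
            return
        # fd is 0, 1 or 2: keep it open as a /dev/null placeholder.
        # Note: Python marks it non-inheritable, so it will not survive exec.


ensure_std_fds()
```

Calling this before any logging or printing avoids the case where a later open() lands on fd 1 or 2 and diagnostic output is written into an unrelated file.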
> >
> > This series adds a new mode:
> >
> > (3) Dumping into an abstract AF_UNIX socket.
> >
> > Userspace can set /proc/sys/kernel/core_pattern to:
> >
> > @linuxafsk/coredump_socket
> >
> > The "@" at the beginning indicates to the kernel that the abstract
> > AF_UNIX coredump socket will be used to process coredumps.
> >
> > The coredump socket uses the fixed address "linuxafsk/coredump.socket"
> > for now.
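The three modes are selected by the first character of core_pattern. A rough illustration of that dispatch as described in this cover letter, not the kernel's actual parsing code; the function name classify_core_pattern is hypothetical:

```python
def classify_core_pattern(pattern: str) -> str:
    """Return which coredump mode a core_pattern string selects.

    Follows the description in the cover letter only:
    "|" selects the usermode-helper pipe, "@" selects the coredump
    socket proposed by this series, anything else is a file path.
    """
    if pattern.startswith("|"):
        return "pipe"
    if pattern.startswith("@"):
        return "socket"
    return "file"


print(classify_core_pattern("|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h"))
print(classify_core_pattern("@linuxafsk/coredump_socket"))
print(classify_core_pattern("core"))
```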
> >
> > The coredump socket is located in the initial network namespace. To bind
> > the coredump socket userspace must hold CAP_SYS_ADMIN in the initial
> > user namespace. Listening and reading can happen from whatever
> > unprivileged context is necessary to safely process coredumps.
> >
> > When a task coredumps it opens a client socket in the initial network
> > namespace and connects to the coredump socket. For now only tasks that
> > are actually coredumping are allowed to connect to the initial coredump
> > socket.
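To make the userspace side concrete, here is a minimal sketch of a listener on an abstract AF_UNIX socket that reads one incoming stream until EOF. It is an illustration under assumptions: SOCK_STREAM is assumed, the helper names are hypothetical, and the name passed in would be the coredump address only on a kernel carrying this series (where binding it needs CAP_SYS_ADMIN in the initial user namespace, as described above).

```python
import socket


def make_abstract_listener(name: str, backlog: int = 16) -> socket.socket:
    """Bind and listen on an abstract AF_UNIX address (Linux only).

    A leading NUL byte puts the address in the abstract namespace,
    so no filesystem path is created.
    """
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(b"\0" + name.encode())
    srv.listen(backlog)
    return srv


def read_one_dump(srv: socket.socket, chunk_size: int = 65536) -> bytes:
    """Accept one connection and read it until the peer closes it."""
    conn, _ = srv.accept()
    chunks = []
    with conn:
        while True:
            data = conn.recv(chunk_size)
            if not data:  # empty read means EOF
                break
            chunks.append(data)
    return b"".join(chunks)
```

A real coredump server would stream to disk instead of buffering in memory and would drop privileges after bind(); both are left out to keep the sketch short.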
>
> I think we should avoid using abstract UNIX sockets, especially for new
Abstract unix sockets are at the core of a modern Linux system. When I
counted during testing, about 100 or so were created during boot alone
on a modern system. Sorry, but this is a no-show argument.