[PATCH RFC v3 08/10] net, pidfs, coredump: only allow coredumping tasks to connect to coredump socket
Mickaël Salaün
mic at digikod.net
Wed May 7 11:51:28 UTC 2025
On Tue, May 06, 2025 at 12:18:12PM -0700, Kuniyuki Iwashima wrote:
> From: Christian Brauner <brauner at kernel.org>
> Date: Tue, 6 May 2025 10:06:27 +0200
> > On Mon, May 05, 2025 at 09:10:28PM +0200, Jann Horn wrote:
> > > On Mon, May 5, 2025 at 8:41 PM Kuniyuki Iwashima <kuniyu at amazon.com> wrote:
> > > > From: Christian Brauner <brauner at kernel.org>
> > > > Date: Mon, 5 May 2025 16:06:40 +0200
> > > > > On Mon, May 05, 2025 at 03:08:07PM +0200, Jann Horn wrote:
> > > > > > On Mon, May 5, 2025 at 1:14 PM Christian Brauner <brauner at kernel.org> wrote:
> > > > > > > Make sure that only tasks that actually coredumped may connect to the
> > > > > > > coredump socket. This restriction may be loosened later in case
> > > > > > > userspace processes would like to use it to generate their own
> > > > > > > coredumps. Though it'd be wiser if userspace just exposed a separate
> > > > > > > socket for that.
> > > > > >
> > > > > > This implementation kinda feels a bit fragile to me... I wonder if we
> > > > > > could instead have a flag inside the af_unix client socket that says
> > > > > > "this is a special client socket for coredumping".
> > > > >
> > > > > Should be easily doable with a sock_flag().
> > > >
> > > > This restriction should be applied by BPF LSM.
> > >
> > > I think we shouldn't allow random userspace processes to connect to
> > > the core dump handling service and provide bogus inputs; that
> > > unnecessarily increases the risk that a crafted coredump can be used
> > > to exploit a bug in the service. So I think it makes sense to enforce
> > > this restriction in the kernel.
> > >
> > > My understanding is that BPF LSM creates fairly tight coupling between
> > > userspace and the kernel implementation, and it is kind of unwieldy
> > > for userspace. (I imagine the "man 5 core" manpage would get a bit
> > > longer and describe more kernel implementation detail if you tried to
> > > show how to write a BPF LSM that is capable of detecting unix domain
> > > socket connections to a specific address that are not initiated by
> > > core dumping.) I would like to keep it possible to implement core
> > > userspace functionality in a best-practice way without needing eBPF.
> > >
> > > > It's hard to loosen such a default restriction as someone might
> > > > argue that's unexpected and a regression.
> > >
> > > If userspace wants to allow other processes to connect to the core
> > > dumping service, that's easy to implement - userspace can listen on a
> > > separate address that is not subject to these restrictions.
> >
> > I think Kuniyuki's point is defensible. And I did discuss this with
> > Lennart when I wrote the patch and he didn't see a point in preventing
> > other processes from connecting to the core dump socket. He actually
> > would like this to be possible because there are some userspace programs
> > out there that generate their own coredumps (Python?) and he wanted them
> > to be able to send those to the general coredump socket.
> >
> > I just found it more elegant to simply guarantee that connections
> > made to that socket only come from coredumping tasks.
> >
> > But I should note there are two ways to cleanly handle this in
> > userspace. I had already mentioned the bpf LSM in the context of
> > rate-limiting in an earlier posting:
> >
> > (1) complex:
> >
> > Use a bpf LSM to intercept the connection request via
> > security_unix_stream_connect() in unix_stream_connect().
> >
> > The bpf program can simply check:
> >
> > current->signal->core_state
> >
> > and reject any connection if it is NULL.
> >
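For reference, a minimal, untested sketch of what such a program could
look like (assumes CONFIG_BPF_LSM and the "bpf" LSM are active; the
program name is made up, and the check restricting the policy to the
coredump socket itself is omitted):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

SEC("lsm/unix_stream_connect")
int BPF_PROG(deny_non_coredump_connect, struct sock *sock,
             struct sock *other, struct sock *newsk)
{
        struct task_struct *task = bpf_get_current_task_btf();

        /* A real policy would first check that @other is the listening
         * coredump socket; that part is left out of this sketch. */

        /* signal->core_state is only non-NULL while the task coredumps. */
        if (!task->signal->core_state)
                return -EPERM;

        return 0;
}

char LICENSE[] SEC("license") = "GPL";
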
> > The big downside is that bpf (and security) need to be enabled.
> > Neither is guaranteed and there are quite a few users out there that
> > don't enable bpf.
The kernel should indeed always have a minimal security policy in place.
LSMs can tailor it, but we should not assume that a specific LSM with
a specific policy is enabled/configured on the system.
> >
> > (2) simple (and supported in this series):
> >
> > Userspace accepts a connection. It has to get SO_PEERPIDFD anyway.
> > It then needs to verify:
> >
> > struct pidfd_info info = {
> >         .mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP,
> > };
> >
> > ioctl(pidfd, PIDFD_GET_INFO, &info);
> > if (!(info.mask & PIDFD_INFO_COREDUMP)) {
> >         // Can't be from a coredumping task so we can close the
> >         // connection without reading.
> >         close(coredump_client_fd);
> >         return;
> > }
> >
> > /* This has to be set and is only settable by do_coredump(). */
> > if (!(info.coredump_mask & PIDFD_COREDUMPED)) {
> >         // Can't be from a coredumping task so we can close the
> >         // connection without reading.
> >         close(coredump_client_fd);
> >         return;
> > }
> >
> > // Ok, this is a connection from a task that has coredumped, let's
> > // handle it.
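For completeness, a rough, untested sketch of the full server-side check,
including how the pidfd is obtained via SO_PEERPIDFD (the function name is
made up; SO_PEERPIDFD needs a recent kernel and headers, and
PIDFD_INFO_COREDUMP / PIDFD_COREDUMPED only exist with this series
applied):

#include <sys/socket.h>
#include <sys/ioctl.h>
#include <linux/pidfd.h>
#include <unistd.h>

/* Accept one connection on the coredump socket and return the client fd
 * only if the peer is a task that actually coredumped. */
static int accept_coredump_client(int listen_fd)
{
        int coredump_client_fd, pidfd = -1;
        socklen_t len = sizeof(pidfd);
        struct pidfd_info info = {
                .mask = PIDFD_INFO_EXIT | PIDFD_INFO_COREDUMP,
        };

        coredump_client_fd = accept(listen_fd, NULL, NULL);
        if (coredump_client_fd < 0)
                return -1;

        /* The pidfd refers to the connecting (crashing) task. */
        if (getsockopt(coredump_client_fd, SOL_SOCKET, SO_PEERPIDFD,
                       &pidfd, &len) < 0)
                goto reject;

        if (ioctl(pidfd, PIDFD_GET_INFO, &info) < 0)
                goto reject;

        /* PIDFD_COREDUMPED is only settable by do_coredump(). */
        if (!(info.mask & PIDFD_INFO_COREDUMP) ||
            !(info.coredump_mask & PIDFD_COREDUMPED))
                goto reject;

        close(pidfd);
        return coredump_client_fd;

reject:
        if (pidfd >= 0)
                close(pidfd);
        close(coredump_client_fd);
        return -1;
}
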
What if the task sends a "fake" coredump and, just after that, really
coredumps? There could be a race condition on the server side when
checking the coredump property of this pidfd.
Could we add a trusted header to the coredump payload that is always
written by the kernel? This would make it possible to read a trusted
flag indicating whether the following payload is a coredump generated
by the kernel or not.
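Purely as an illustration of that idea (nothing like this exists in the
series as posted; the struct and field names are invented), such a
kernel-written preamble could look like:

#include <linux/types.h>

/* Hypothetical preamble prepended by the kernel before the coredump
 * payload; invented here only to illustrate the suggestion above. */
struct coredump_hdr {
        __u32 magic;    /* fixed constant identifying a kernel-written dump */
        __u32 version;  /* header format version */
        __u64 flags;    /* e.g. a bit meaning "generated by do_coredump()" */
};
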
> >
> > The crux is that the series guarantees that, by the time the
> > connection is made, the information whether the task/thread-group
> > did coredump is available via the pidfd.
> >
> > I think if we document that most coredump servers have to do (2) then
> > this is fine. But I wouldn't mind a nod from Jann on this.
>
> I like approach (2), which allows users to filter for the right client.
> This way we can extend the application flexibly for another coredump
> service.