[PATCH v2 2/2] selinux: Implement mptcp_add_subflow hook

Mon Jan 9 10:31:04 UTC 2023

Hello,

I'm sorry for the long delay:  I was on PTO with limited internet
access.

On Fri, 2022-12-23 at 12:11 -0500, Paul Moore wrote:
> On Thu, Dec 22, 2022 at 10:57 AM Paolo Abeni <pabeni at redhat.com> wrote:
> > 
> > On Wed, 2022-12-21 at 20:21 -0500, Paul Moore wrote:
> > > On Wed, Dec 21, 2022 at 2:24 PM Paolo Abeni <pabeni at redhat.com> wrote:
> > > > I just tested the other option and there is another problem :(
> > > 
> > > It's never easy, is it? ;)
> > > 
> > > > The first subflow creations happens inside af_inet->create, via the sk-
> > > > > sk_prot->init() hook. The security_socket_post_create() call on the
> > > > owning MPTCP sockets happens after that point. So we copy data from a
> > > > not yet initialized security context (and the test fail badly).
> > > 
> > > Hmmm.  Let's come back to this later on down this email.
> > > 
> > > > There are a few options to cope with that:
> > > > - [ugly hack] call  security_socket_post_create() on the mptcp code
> > > > before creating the subflow. I experimented this just to double the
> > > > problem and a possible solution.
> > > 
> > > I'm guessing "[ugly hack]" is probably a bit of an understatement.
> > > Let's see if we can do better before we explore this option too much
> > > further.
> > 
> > Yup, I compiled the list in "brainstom-mode", trying to include
> > whatever would be possible even if clearly not suitable.
> > 
> > [...]
> > 
> > > > WDYT?
> > > 
> > > Let's go back to the the inet_create() case for a little bit.  I'm
> > > thinking we might be able to do something by leveraging the
> > > sk_alloc()->sk_prot_alloc()->security_sk_alloc() code path.  As
> > > inet_create() is going to be called from task context here, it seems
> > > like we could do the sock's sid/sclass determination here, cached in
> > > separate fields in the sk_security_struct if necessary, and use those
> > > in a new MPTCP subflow hook.  We could also update
> > > selinux_socket_post_create() to take advantage of this as well.  We
> > > could also possibly pass the proto struct into security_sk_alloc() if
> > > we needed to identify IPPROTO_MPTCP there as well.
> > > 
> > > I'll admit to not chasing down all the details, but I suspect this may
> > > be the cleanest option - thoughts?
> > 
> > Thanks, I did not consider such possibility!
> > 
> > I think we should be careful to avoid increasing sk_security_struct
> > size. Currently it is 16 bytes, nicely matching a kmalloc slab, any
> > increase will move it on kmalloc-32 bytes slab possibly causing
> > performance and memory regressions).
> 
> FWIW, it is likely that this will end up growing in the future to
> support stacking LSMs.  It's unfortunate, messy, and generally ugly,
> but inevitable.
> 
> See the selinux_inode() function as a similar example using
> inode/inode_security_struct.
> 
> > More importantly, I think there is a problem with the
> > sk_clone_lock() -> sk_prot_alloc() -> security_sk_alloc()
> > code path.
> > 
> > sk_clone_lock() happens in BH context, if security_transition_sid()
> > needs process context that would be a problem - quickly skimming the
> > code it does not look so, I need to double check.
> 
> The problem is that in both selinux_socket_create() and
> selinux_socket_post_create() the credentials from @current are needed
> to determine the sock/sk_security_stuct label.  In
> selinux_socket_create() @current's credentials are also used to
> determine if the socket can be created.
> 
> It's looking like doing labeling determinations in the
> security_sk_alloc() struct is not going to work.  While
> sk_clone_lock() will end up calling into
> security_sk_clone()/selinux_sk_clone_security() via sock_copy(), if I
> understand you correctly that won't help as the main MPTCP socket is
> not yet setup (e.g. selinux_socket_post_create() has not yet been run
> on the main socket).
> 
> > Perhaps the cleanest option could be the one involving the mptcp
> > refactoring, moving subflow creation at a later stage. It could have
> > some minor side benefit for MPTCP, too - solving:
> > 
> > https://github.com/multipath-tcp/mptcp_net-next/issues/290
> > 
> > but I'm not fond of that option because it will require quite a bit of
> > time: we need first to have the mptcp refactor in place and then cook
> > the lsm patches. I guess such process will require at least 2 release
> > cycles, due to the needed mptcp(netdev)/lsm trees synchronization.
> 
> I generally take the long term view when it comes to Linux Kernel
> development; given the nature of the kernel, and all the constraints
> that come with it, I would much rather pursue solutions that we
> believe have the longest staying power.
> 
> I'm also happy to work on, and/or review, LSM patches in conjunction
> with a MPTCP refactor.  If the only reason to split the work over two
> kernel releases is to soften the blow during the merge window, I think
> we can work that out in a single release ... at least I say that now
> :)

I thought about doing the MPTCP and selinux patches sequentially to
both avoid the possibly untrivial conflicts resultion issues and to
ensure that the mptcp patches are in place when the selinux ones are
applied, as there is a fuctional dependency. 

> Basically when it comes down to it, I want to make sure that any fix
> we come up with *works*.  In my mind that means doing the LSM fix in
> conjunction with the rework; I would hate to see all of the rework
> happen only to find out later that it still didn't solve the LSM
> problem.
> 
> Does that make sense?

Indeed it makes sense to me.

I think we can address that concern in a quite consolidated way. We
usually include in the MPTCP tree a (very limited) number of patches
that will not be submitted to the netdev because belong to other trees
and/or are handled/owned by others devel. 

We use the above e.g. to fix build and/or functional issues in our
self-tests caused by other subsystems without the need to wait for the
proper fix to land into vanilla. When such event happen, we simply drop
the local copy of the fixup patch.

We could use a similar schema in this scenario. We can include the the
LSM patches to the mptcp in our tree while the rework is in progress to
ensure that overall the effort really addresses the LSM issue.

We can rebase the LSM patches as needed to address conflicts as
needed/if/when they pops up.

Once that the mptcp patches will land into the LSM tree, we will submit
formally the LSM ones to you. During the process I'll check and ensure
that the LSM issue is really/still fixed. Would that work for you?

> > If that would prove to be the most reasonable option, could we consider
> > to transiently merge first something alike:
> > 
> > https://lore.kernel.org/mptcp/CAHC9VhSQnhH3UL4gqzu+YiA1Q3YyLLCv88gLJOvw-0+uw5Lvkw@mail.gmail.com/T/#m06c612f84f6b6fe759e670573b2c8092df71607b
> > 
> > to have a workable short-term solution, and later revert it when the
> > final solution would be in place?
> 
> I would need to go back through that to make sure that it makes sense,
> and ensure that the hook is idempotent for SELinux, AppArmor, and
> Smack (current hook implementations), *aaaand* if we promise that this
> is just a *temporary* hack I think I would be okay with that.

I would appreciate that addtional extra mile a lot, as it will allow a
(temporary!) fix to be delivered quite earlier than what the above
process will provide. 

Of course the deal is that I'll take ownership of pursuing the complete
fix till resolution.

Many thanks!

Paolo