[PATCH net-next 06/10] cipso_ipv4: use iph_set_totlen in skbuff_setattr

Mon Jan 16 19:33:50 UTC 2023

On Mon, Jan 16, 2023 at 1:13 PM Paul Moore <paul at paul-moore.com> wrote:
>
> On Mon, Jan 16, 2023 at 12:37 PM Xin Long <lucien.xin at gmail.com> wrote:
> > On Mon, Jan 16, 2023 at 11:46 AM Paul Moore <paul at paul-moore.com> wrote:
> > > On Sat, Jan 14, 2023 at 12:54 PM Xin Long <lucien.xin at gmail.com> wrote:
> > > > On Sat, Jan 14, 2023 at 10:39 AM Paul Moore <paul at paul-moore.com> wrote:
> > > > > On Fri, Jan 13, 2023 at 10:31 PM Xin Long <lucien.xin at gmail.com> wrote:
> > > > > >
> > > > > > It may process IPv4 TCP GSO packets in cipso_v4_skbuff_setattr(), so
> > > > > > the iph->tot_len update should use iph_set_totlen().
> > > > > >
> > > > > > Note that for these non GSO packets, the new iph tot_len with extra
> > > > > > iph option len added may become greater than 65535, the old process
> > > > > > will cast it and set iph->tot_len to it, which is a bug. In theory,
> > > > > > iph options shouldn't be added for these big packets in here, a fix
> > > > > > may be needed here in the future. For now this patch is only to set
> > > > > > iph->tot_len to 0 when it happens.
> > > > >
> > > > > I'm not entirely clear on the paragraph above, but we do need to be
> > > > > able to set/modify the IP options in cipso_v4_skbuff_setattr() in
> > > > > order to support CIPSO labeling.  I'm open to better and/or
> > > > > alternative solutions compared to what we are doing now, but I can't
> > > > > support a change that is a bug waiting to bite us.  My apologies if
> > > > > I'm interpreting your comments incorrectly and that isn't the case
> > > > > here.
> > > > setting the IP options may cause the packet size to grow (both iph->tot_len
> > > > and skb->len), for example:
> > > >
> > > > before setting it, iph->tot_len=65535,
> > > > after setting it, iph->tot_len=65535 + 14 (assume the IP option len is 14)
> > > > however, tot_len is 16 bit, and can't be set to "65535 + 14".
> > > >
> > > > Hope the above makes it clearer.
> > >
> > > Thanks, it does.
> > >
> > > > This problem exists with or without this patch. Not sure how it should
> > > > be fixed in cipso_v4 ...
> > > >
> > > > Not sure if we can skip the big packet, or segment/fragment the big packet
> > > > in cipso_v4_skbuff_setattr().
> > >
> > > We can't skip the CIPSO labeling as that would be the network packet
> > > equivalent of not assigning a owner/group/mode to a file on the
> > > filesystem, which is a Very Bad Thing :)
> > >
> > > I spent a little bit of time this morning looking at the problem and I
> > > think the right approach is two-fold: first introduce a simple check
> > > in cipso_v4_skbuff_setattr() which returns -E2BIG if the packet length
> > > grows beyond 65535.  It's rather crude, but it's a tiny patch and
> > > should at least ensure that the upper layers (NetLabel and SELinux)
> > > don't send the packet with a bogus length field; it will result in
> > > packet drops, but honestly that seems preferable to a mangled packet
> > > which will likely be dropped at some point in the network anyway.
> > >
> > > diff --git a/net/ipv4/cipso_ipv4.c b/net/ipv4/cipso_ipv4.c
> > > index 6cd3b6c559f0..f19c9beda745 100644
> > > --- a/net/ipv4/cipso_ipv4.c
> > > +++ b/net/ipv4/cipso_ipv4.c
> > > @@ -2183,8 +2183,10 @@ int cipso_v4_skbuff_setattr(struct sk_buff *skb,
> > >         * that the security label is applied to the packet - we do the same
> > >         * thing when using the socket options and it hasn't caused a problem,
> > >         * if we need to we can always revisit this choice later */
> > > -
> > >        len_delta = opt_len - opt->optlen;
> > > +       if ((skb->len + len_delta) > 65535)
> > > +               return -E2BIG;
> > > +
> >
> > Right, looks crude. :-)
>
> Yes, but what else can we do?  There is fragmentation, but that is
> rather ugly and we would still need a solution for when the don't
> fragment bit is set.  I'm open to suggestions.
looking at ovs_dp_upcall(), for GSO/GRO packets it goes to
queue_gso_packets() where it calls __skb_gso_segment()
to segment it into small segs/skbs, then process these segs instead.

I'm thinking you can try to do the same in cipso_v4_skbuff_setattr(),
and I don't think 64K non-GSO packets exist in the user environment,
so taking care of GSO packets should be enough.

I just don't know if the security_hook will be able to process these
smaller segs/skbs after the segment.

>
> > Note that for BIG TCP packets skb->len is bigger than 65535.
> > (I assume CIPSO labeling will drop BIG TCP packets.)
>
> It seems like there is still ongoing discussion about even enabling
> BIG TCP for IPv4, however for this discussion let's assume that BIG
> TCP is merged for IPv4.
>
> We really should have a solution that allows CIPSO for both normal and
> BIG TCP, if we don't we force distros and admins to choose between the
> two and that isn't good.  We should do better.  If skb->len > 64k in
> the case of BIG TCP, how is the packet eventually divided/fragmented
> in such a way that the total length field in the IPv4 header doesn't
> overflow?  Or is that simply handled at the driver/device layer and we
> simply set skb->len to whatever the size is, regardless of the 16-bit
Yes, for BIG TCP, 16-bit length is set to 0, and it just uses skb->len
as the IP packet length.

> length limit?  If that is the case, does the driver/device layer
> handle copying the IPv4 options and setting the header/total-length
> fields in each packet?  Or is it something else completely?
Yes, I think the driver/device layer will handle copying the IPv4 options
and setting the header/total-length, and that's how it works.

>
> > >        /* if we don't ensure enough headroom we could panic on the skb_push()
> > >         * call below so make sure we have enough, we are also "mangling" the
> > >         * packet so we should probably do a copy-on-write call anyway */
> > >
> > > The second step will be to add a max-length IPv4 option filled with
> > > IPOPT_NOOP to the local sockets in the case of
> > > netlbl_sock_setattr()/NETLBL_NLTYPE_ADDRSELECT.  In this case we would
> > > either end up replacing the padding with a proper CIPSO option or
> > > removing it completely in netlbl_skbuff_setattr(); in either case the
> > > total packet length remains the same or decreases so we should be
> > > "safe".
> >
> > sounds better.
>
> To be clear, it's not a one-or-the-other choice, both patches would be
> necessary as the padding route described in the second step would only
> apply to traffic generated locally from a socket.  We still have the
> ugly problem of dealing with forwarded traffic, which it looks like we
> discuss more on that below ...
I got that.

>
> > > The forwarded packet case is still a bit of an issue, but I think the
> > > likelihood of someone using 64k max-size IPv4 packets on the wire
> > > *and* passing this traffic through a Linux cross-domain router with
> > > CIPSO (re)labeling is about as close to zero as one can possibly get.
> > > At least given the size check present in step one (patchlet above) the
> > > packet will be safely (?) dropped on the Linux system so an admin will
> > > have some idea where to start looking.
> >
> > don't know the likelihood of CIPSO (re)labeling on a Linux cross-domain router.
> > But 64K packets could be GRO packets merged on the interface (GRO enabled)
> > of the router, not directly from the wire.
>
> In the GRO case, is it safe to grow the packet such that skb->len is
> greater than 64k?  I presume that the device/driver is going to split
> the packet anyway and populate the IPv4 total length fields in the
> header anyway, right?  If we can't grow the packet beyond 64k, is
> there some way to signal to the driver/device at runtime that the
> largest packet we can process is 64k minus 40 bytes (for the IPv4
> options)?
at runtime, not as far as I know.
It's a field of the network device that can be modified by:
# ip link set dev eth0 gro_max_size $MAX_SIZE gso_max_size $MAX_SIZE

>
> > > I've got some basic patches written for both, but I need to at least
> > > give them a quick sanity test before posting.
> >
> > Good that you can take care of it from the security side.
>
> It's not yet clear to me that we can, see the above questions.
>
> > I think a similar problem exists in calipso_skbuff_setattr() in
> > net/ipv6/calipso.c.
>
> Probably, but one mess at a time, especially since several of the
> questions above apply to both IPv4 and IPv6.
>
Just wanted you to know that IPv6 BIG TCP is something already
in the kernel, unlike IPv4 BIG TCP, we can avoid this problem in
advance before it's in the kernel.

Thanks.