[RFC 0/9] Nginx refcount scalability issue with Apparmor enabled and potential solutions

John Johansen john.johansen at canonical.com
Wed May 29 00:37:12 UTC 2024


On 5/28/24 06:29, Mateusz Guzik wrote:
> On Fri, May 24, 2024 at 11:52 PM John Johansen
> <john.johansen at canonical.com> wrote:
>>
>> On 5/24/24 14:10, Mateusz Guzik wrote:
>>> On Fri, Mar 8, 2024 at 9:09 PM John Johansen
>>> <john.johansen at canonical.com> wrote:
>>>>
>>>> On 3/2/24 02:23, Mateusz Guzik wrote:
>>>>> On 2/9/24, John Johansen <john.johansen at canonical.com> wrote:
>>>>>> On 2/6/24 20:40, Neeraj Upadhyay wrote:
>>>>>>> Gentle ping.
>>>>>>>
>>>>>>> John,
>>>>>>>
>>>>>>> Could you please confirm that:
>>>>>>>
>>>>>>> a. The AppArmor refcount usage described in the RFC is correct?
>>>>>>> b. Approach taken to fix the scalability issue is valid/correct?
>>>>>>>
>>>>>>
>>>>>> Hi Neeraj,
>>>>>>
>>>>>> I know your patchset has been waiting on review for a long time.
>>>>>> Unfortunately I have been very, very busy lately. I will try to
>>>>>> get to it this weekend, but I can't promise that I will be able
>>>>>> to get the review fully done.
>>>>>>
>>>>>
>>>>> Gentle prod.
>>>>>
>>>>> Any chances of this getting reviewed in the foreseeable future? Would
>>>>> be a real bummer if the patchset fell through the cracks.
>>>>>
>>>>
>>>> yes, sorry I have been unavailable for the last couple of weeks. I am
>>>> now back, I have a rather large backlog to try catching up on but this
>>>> has an entry on the list.
>>>>
>>>
>>> So where do we stand here?
>>>
>> sorry I am still trying to dig out of my backlog, I will look at this,
>> this weekend.
>>
> 
> How was the weekend? ;)
> 

Let's say it was busy. Have I looked at this? Yes. I am still digesting it.
I don't have objections to moving towards percpu refcounts, but the overhead
of a percpu struct per label is a problem when we have thousands of labels
on the system. That is to say, this would have to be a config option. We
moved buffers from kmalloc to percpu to reduce memory overhead and to reduce
contention. Then from percpu to a global pool because the percpu overhead was
too high for some machines, and then from a global pool to a hybrid scheme
because of global lock contention. I don't see a way of doing that with the
label, which means a config would be the next best thing.
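
To put rough numbers on the overhead concern, here is a back-of-the-envelope
sketch. It assumes one 8-byte counter slot per CPU per label; the function
names, the label count and the CPU count are hypothetical and only there for
illustration, not measurements or actual AppArmor code.

```c
#include <stdint.h>

/*
 * Rough estimate of the memory consumed by per-label percpu reference
 * counters. All parameters are illustrative assumptions, not measurements
 * from a real system.
 */
static uint64_t percpu_refcount_overhead(uint64_t nr_labels,
                                         uint64_t nr_cpus,
                                         uint64_t bytes_per_counter)
{
    /* One counter slot per CPU, for every label. */
    return nr_labels * nr_cpus * bytes_per_counter;
}

/* A single shared counter per label, for comparison. */
static uint64_t shared_refcount_overhead(uint64_t nr_labels,
                                         uint64_t bytes_per_counter)
{
    return nr_labels * bytes_per_counter;
}
```

With 10,000 labels on a 256-CPU machine, percpu_refcount_overhead(10000, 256, 8)
comes to 20,480,000 bytes (roughly 19.5 MiB), against 80,000 bytes for
shared_refcount_overhead(10000, 8). The cost grows with labels times CPUs,
which is why it would need to be opt-in.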

Not part of your patch, but something to be considered is that the label tree
needs a rework: its locking needs to move to a read-side lockless scheme,
and the plan was to make it also use a linked list such that new labels are
always queued at the end, allowing dynamically created labels to be lazily
added to the tree.

I see the use of the kworker as problematic as well, especially if we are
talking about using kconfig to switch reference counting modes. I am futzing
with some ideas on how to deal with this.

Like I said I am still digesting.



