Flush the hold queue fall into an infinite loop.

Fri Jan 14 01:22:16 UTC 2022

I want to stop droping the logs into audit_hold_queue when the auditd is abnormal.it
seems that this modification goes against the design intent of audit_hold_queue. its
effect is similar to removing the audit_hold_queue.

diff --git a/kernel/audit.c b/kernel/audit.c
index 2a38cbaf3ddb..a8091b1a6587 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -748,6 +748,7 @@ static int kauditd_send_queue(struct sock *sk, u32 
portid,
                                         (*err_hook)(skb);
                                 if (rc == -EAGAIN)
                                         rc = 0;
+                               audit_default = AUDIT_OFF;
                                 /* continue to drain the queue */
                                 continue;
                         } else
@@ -755,6 +756,7 @@ static int kauditd_send_queue(struct sock *sk, u32 
portid,
                 } else {
                         /* skb sent - drop the extra reference and 
continue */
                         consume_skb(skb);
+                       audit_default = audit_enabled;
                         failed = 0;
                 }
         }

在 2022/1/13 23:22, Paul Moore 写道:
> On Thu, Jan 13, 2022 at 6:57 AM cuigaosheng <cuigaosheng1 at huawei.com> wrote:
>> When we add "audit=1" to the cmdline, kauditd will take up 100%
>> cpu resource.As follows:
>>
>> configurations:
>> auditctl -b 64
>> auditctl --backlog_wait_time 60000
>> auditctl -r 0
>> auditctl -w /root/aaa  -p wrx
>> shell scripts：
>> #!/bin/bash
>> i=0
>> while [ $i -le 66 ]
>> do
>>     touch /root/aaa
>>     let i++
>> done
>> mandatory conditions:
>>
>> add "audit=1" to the cmdline, and kill -19 pid_number(for /sbin/auditd).
>>
>>   As long as we keep the audit_hold_queue non-empty, flush the hold queue will fall into
>>   an infinite loop.
>>
>> 713 static int kauditd_send_queue(struct sock *sk, u32 portid,
>>   714                               struct sk_buff_head *queue,
>>   715                               unsigned int retry_limit,
>>   716                               void (*skb_hook)(struct sk_buff *skb),
>>   717                               void (*err_hook)(struct sk_buff *skb))
>>   718 {
>>   719         int rc = 0;
>>   720         struct sk_buff *skb;
>>   721         unsigned int failed = 0;
>>   722
>>   723         /* NOTE: kauditd_thread takes care of all our locking, we just use
>>   724          *       the netlink info passed to us (e.g. sk and portid) */
>>   725
>>   726         while ((skb = skb_dequeue(queue))) {
>>   727                 /* call the skb_hook for each skb we touch */
>>   728                 if (skb_hook)
>>   729                         (*skb_hook)(skb);
>>   730
>>   731                 /* can we send to anyone via unicast? */
>>   732                 if (!sk) {
>>   733                         if (err_hook)
>>   734                                 (*err_hook)(skb);
>>   735                         continue;
>>   736                 }
>>   737
>>   738 retry:
>>   739                 /* grab an extra skb reference in case of error */
>>   740                 skb_get(skb);
>>   741                 rc = netlink_unicast(sk, skb, portid, 0);
>>   742                 if (rc < 0) {
>>   743                         /* send failed - try a few times unless fatal error */
>>   744                         if (++failed >= retry_limit ||
>>   745                             rc == -ECONNREFUSED || rc == -EPERM) {
>>   746                                 sk = NULL;
>>   747                                 if (err_hook)
>>   748                                         (*err_hook)(skb);
>>   749                                 if (rc == -EAGAIN)
>>   750                                         rc = 0;
>>   751                                 /* continue to drain the queue */
>>   752                                 continue;
>>   753                         } else
>>   754                                 goto retry;
>>   755                 } else {
>>   756                         /* skb sent - drop the extra reference and continue */
>>   757                         consume_skb(skb);
>>   758                         failed = 0;
>>   759                 }
>>   760         }
>>   761
>>   762         return (rc >= 0 ? 0 : rc);
>>   763 }
>>
>> When kauditd attempt to flush the hold queue, the queue parameter is &audit_hold_queue,
>> and if netlink_unicast(line 741 ) return -EAGAIN, sk will be NULL(line 746), so err_hook(kauditd_rehold_skb)
>> will be call. Then continue, skb_dequeue(line 726) and err_hook(kauditd_rehold_skb,line 733) will
>> fall into an infinite loop.
>> I don't really understand the value of audit_hold_queue, can we remove it, or stop droping the logs
>> into kauditd_rehold_skb when the auditd is abnormal?
> Thanks Gaosheng for the bug report, I'm able to reproduce this and I'm
> looking into it now.  I'll report back when I have a better idea of
> the problem and a potential fix.
>