trace_printk issue. Was: [PATCH bpf-next] bpf, capabilities: introduce CAP_BPF

Alexei Starovoitov ast at fb.com
Fri Oct 4 19:56:04 UTC 2019


On 10/3/19 9:41 AM, Steven Rostedt wrote:
> On Thu, 3 Oct 2019 09:18:40 -0700
> Alexei Starovoitov <alexei.starovoitov at gmail.com> wrote:
> 
>> I think dropping last events is just as bad. Is there a mode to overwrite old
>> and keep the last N (like perf does) ?
> 
> Well, it drops it by pages. Thus you should always have the last page
> of events.
> 
>> Peter Wu brought this issue to my attention in
>> commit 55c33dfbeb83 ("bpf: clarify when bpf_trace_printk discards lines").
>> And later sent similar doc fix to ftrace.rst.
> 
> It was documented there, he just elaborated on it more:
> 
>          This file holds the output of the trace in a human
>          readable format (described below). Note, tracing is temporarily
> -       disabled while this file is being read (opened).
> +       disabled when the file is open for reading. Once all readers
> +       are closed, tracing is re-enabled.
> 
> 
>> To be honest if I knew of this trace_printk quirk I would not have picked it
>> as a debugging mechanism for bpf.
>> I urge you to fix it.
> 
> It's not a trivial fix by far.
> 
> Note, trying to read the trace file without disabling the writes to it,
> will most likely make reading it when function tracing enabled totally
> garbage, as the buffer will most likely be filled for every read event.
> That is, each read event will not be related to the next event that is
> read, making it very confusing.
> 
> Although, I may be able to make it work per page. That way you get at
> least a page worth of events.

That sounds much better. As long as trace_printk() doesn't disappear
into the void, it's good.

But the part I'm not getting is why trace_printk() has

	if (tracing_disabled) goto out;

It's a concurrent ring buffer. One cpu can write into it while
another is reading. What is the point of disabling trace_printk in particular?
Each __buffer_unlock_commit is an atomic ring buffer update,
so read from trace will either see it as a whole or won't see it.
'trace_pipe' clearly works fine. Why is 'trace' any different?
Just keep tracing enabled and keep reading until the end of the current
ring buffer. Whether open() determines the current end or it reads until
next=0 is an implementation detail.
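
To illustrate the argument, here is a toy userspace model, not the kernel's
ring buffer. All names (toy_ring, toy_write, toy_read_snapshot, SLOTS,
MSG_LEN) are made up for this sketch, and it assumes a single writer. The
writer publishes each event with one atomic store, so a reader sees an event
either whole or not at all. The reader snapshots the commit count once,
roughly as open() could, and reads up to that point. It does not handle a
writer overwriting a slot while it is being copied, which is probably the
non-trivial part Steven refers to.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define SLOTS   8   /* hypothetical number of event slots */
#define MSG_LEN 32  /* hypothetical fixed event size */

struct toy_ring {
	char slot[SLOTS][MSG_LEN];
	atomic_size_t committed;  /* count of fully written events */
};

/* Single writer: fill the slot first, then publish it with a release store. */
static void toy_write(struct toy_ring *rb, const char *msg)
{
	size_t head = atomic_load_explicit(&rb->committed, memory_order_relaxed);
	char *dst = rb->slot[head % SLOTS];

	strncpy(dst, msg, MSG_LEN - 1);
	dst[MSG_LEN - 1] = '\0';
	/* The event becomes visible to readers only after this store. */
	atomic_store_explicit(&rb->committed, head + 1, memory_order_release);
}

/*
 * Reader: snapshot the commit count once and copy events [from, end).
 * Events already overwritten (more than SLOTS behind) are skipped.
 * Returns the number of events copied into out.
 */
static size_t toy_read_snapshot(struct toy_ring *rb, size_t from,
				char out[][MSG_LEN], size_t max)
{
	size_t end = atomic_load_explicit(&rb->committed, memory_order_acquire);
	size_t n = 0;

	if (from > end)
		from = end;
	if (end - from > SLOTS)
		from = end - SLOTS;  /* keep only the last SLOTS events */

	for (size_t i = from; i < end && n < max; i++, n++)
		memcpy(out[n], rb->slot[i % SLOTS], MSG_LEN);
	return n;
}
```

In this model the writer never has to be stopped for the reader to get a
consistent prefix of events; anything committed after the snapshot is simply
picked up by the next read.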


