
Does anyone know how to get the maximum event period value (or the value that the kernel actually passes to the PMU) for a perf event?

I'm using perf to measure my program as follows: `perf record -d -e cpu/event=0xd0,umask=0x81/ppu,cpu/event=0xd0,umask=0x82/ppu -c 5`

`cpu/event=0xd0,umask=0x81/ppu` measures all loads on the CPU, and `cpu/event=0xd0,umask=0x82/ppu` measures all stores.
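For readers unfamiliar with the `event=`/`umask=` syntax, here is a minimal sketch of how the two fields are usually packed into a single raw config value. It assumes the common x86 layout (event select in bits 0-7, unit mask in bits 8-15); the helper name `raw_config` is made up for illustration and is not part of perf.

```python
def raw_config(event: int, umask: int) -> int:
    """Pack an event select and unit mask into one raw config value.

    Assumes the common x86 layout: event select in bits 0-7,
    unit mask in bits 8-15. Other fields (cmask, edge, ...) are ignored.
    """
    if not (0 <= event <= 0xFF and 0 <= umask <= 0xFF):
        raise ValueError("event and umask must each fit in 8 bits")
    return (umask << 8) | event


if __name__ == "__main__":
    # The two events from the command above.
    print(hex(raw_config(0xD0, 0x81)))  # loads
    print(hex(raw_config(0xD0, 0x82)))  # stores
```

Under that assumption, the two events in my command correspond to raw values 0x81d0 and 0x82d0.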

I tried to trace how the arguments are passed by running perf under strace, but found nothing useful.

If the PMU receives a value beyond its capability, will it still try to honor it? If so, where can I find the related code, and what is the maximum event period for these events?

Thanks everyone.

Kaniel Venson
  • What is an event period value? – Margaret Bloom Jul 05 '17 at 14:16
  • Hi Margaret, it is the number of event occurrences after which the PMU collects a sample record. – Kaniel Venson Jul 05 '17 at 14:53
  • For example, -c 1000 means that a sample record is generated after every 1000 occurrences of the event. – Kaniel Venson Jul 05 '17 at 14:55
  • Ah, what Intel calls *Counter mask (CMASK)* in its manual. It's an 8-bit field, so I guess 255 is the max? – Margaret Bloom Jul 05 '17 at 14:59
  • But looking at perf manual, maybe it's something else – Margaret Bloom Jul 05 '17 at 15:01
  • Yes, CMASK looks like a small counter inside the CPU, but it should be different from perf's -c. Otherwise perf would not allow -c values above 255. – Kaniel Venson Jul 05 '17 at 15:11
  • I'm confused: whether I pass -c 1 or -c 100, the amount of measured data is the same. This may imply that the value I specified is beyond the PMU's capability, so the kernel changes it somewhere and passes a larger value (maybe 5000) to the PMU, but I have no idea. – Kaniel Venson Jul 05 '17 at 15:19
  • Hi @KanielVenson, can you please mention what kind of program you are running and how big it is? How many samples are you collecting? Usually, if you set the period to 1, perf will try to collect every memory load event, so the sample count is normally larger than with a period of 100. However, a period of 1 is quite hard on the CPU if your program performs memory loads and stores at frequent intervals. And of course, -c allows much larger values than 255. – Arnabjyoti Kalita Jul 05 '17 at 15:35
  • Thanks @ArnabjyotiKalita, my program performs a large number of accesses (the test data has over 60m actions), but when I measure it with the above command (with -c 1000), the result shows only 41k samples, which seems illogical. – Kaniel Venson Jul 05 '17 at 15:55

1 Answer


The perf record command accepts period values much larger than 255. Internally, the processor maintains a counter that counts memory loads and stores (or any other supported event). Once the counter overflows, the processor records information about the memory load/store you are sampling (architectural state, registers, etc.).

Once the counter overflows, it must also be reset. Usually the counter is reset to a negative value; since it counts upward from there, it overflows again when it reaches 0.

This counter reset value is the period you asked about. If the period is specified as -c 1, the counter reset value is set to -1, so the next memory load/store increments the counter to 0 (causing a counter overflow) and the event is recorded.

Thus, if you set the period to 1, the counter overflows on every memory load/store event and you record all of them (this is only conceptual, however; the hardware usually cannot do this).
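To make the reset-and-overflow idea concrete, here is a small Python model of it. This is only a sketch of the conceptual behaviour described above, not how the kernel or the PMU is actually implemented: the 48-bit width is an assumption, and the names `reset_value` and `count_samples` are invented for this example.

```python
COUNTER_BITS = 48                  # assumed counter width for this sketch
MASK = (1 << COUNTER_BITS) - 1     # largest value the counter can hold


def reset_value(period: int) -> int:
    """Encode -period as an unsigned value in a COUNTER_BITS-wide counter."""
    if not 1 <= period <= MASK:
        raise ValueError("period must be between 1 and 2**COUNTER_BITS - 1")
    return (-period) & MASK


def count_samples(total_events: int, period: int) -> int:
    """Count how many overflows (samples) occur over total_events events."""
    counter = reset_value(period)
    samples = 0
    for _ in range(total_events):
        counter = (counter + 1) & MASK   # one event increments the counter
        if counter == 0:                 # wrapped around to 0: overflow
            samples += 1
            counter = reset_value(period)  # re-arm for the next period
    return samples


if __name__ == "__main__":
    print(count_samples(10, 1))        # every event produces a sample
    print(count_samples(10_000, 1000)) # one sample per 1000 events
```

In this idealized model the number of samples is simply the number of events divided by the period, rounded down. Real measurements can come out lower, for the reasons mentioned in the parenthesis above.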

This means the period can be as large as the width of the hardware counter for these events allows. In modern microarchitectures such as Haswell/Broadwell/Skylake, these counters are usually 48 bits wide, so the period might go as high as 2^48-1. However, using such large values is not recommended.

Usually, the period should be kept to at most 2^32-1 on 32-bit systems, and the same limit is commonly followed on other systems too.
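As a rough illustration of these limits, the arithmetic can be sketched as follows. The event rate of 10^9 events per second is a made-up figure chosen only to show the scale, and the function names are hypothetical.

```python
def max_period(counter_bits: int) -> int:
    """Largest period representable in a counter of the given width."""
    if counter_bits <= 0:
        raise ValueError("counter_bits must be positive")
    return (1 << counter_bits) - 1


def seconds_per_sample(period: int, events_per_second: float) -> float:
    """Time between two samples at a steady, hypothetical event rate."""
    if events_per_second <= 0:
        raise ValueError("events_per_second must be positive")
    return period / events_per_second


if __name__ == "__main__":
    rate = 1e9  # hypothetical: one billion events per second
    for bits in (32, 48):
        period = max_period(bits)
        gap = seconds_per_sample(period, rate)
        print(f"{bits}-bit counter: max period {period}, "
              f"about {gap:.0f} s between samples")
```

At that assumed rate, a period near 2^48-1 would produce one sample only every few days, which is one way to see why such large values are not practical.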

Sources :

  1. Chapter 18 of this book

  2. Please read the topic Sampling with perf record in this link too

  3. If you want you can read the answer to this question too.

Arnabjyoti Kalita
  • Yes, I understand what you said, but my question is: if the PMU really behaves as you describe, why do I get so few results with `-c 1`? – Kaniel Venson Jul 05 '17 at 16:46
  • I have read the Intel SDM and the perf-record manual, but cannot explain this phenomenon. – Kaniel Venson Jul 05 '17 at 16:48
  • Hi @KanielVenson, it is simply not possible to record all memory access events. A small value of -c is not recommended as it affects CPU performance. The CPU will throttle because an interrupt occurs every time the buffers fill up. It is simply beyond the control of the hardware. – Arnabjyoti Kalita Jul 05 '17 at 16:51
  • But I'm using PEBS, which does not trigger an interrupt until the PEBS buffer is filled with records. If, as you said, the CPU throttles because too many interrupts arrive, there should be lots of sample records in the result, right? – Kaniel Venson Jul 05 '17 at 17:01
  • Hi @KanielVenson, there are many other factors in the picture here. You cannot simply record "lots of" samples. There might be other CPU parameters that automatically reduce the rate of sample collection; please check `dmesg` or other kernel logs. You might get some hints there. – Arnabjyoti Kalita Jul 05 '17 at 17:11
  • Got it, I'll check dmesg first. I really appreciate your help. – Kaniel Venson Jul 05 '17 at 17:40
  • Hi @ArnabjyotiKalita, according to Documentation/sysctl/kernel.txt, the CPU throttling mechanism seems only to lower `kernel.perf_event_max_sample_rate`. Do you know the relation between `kernel.perf_event_max_sample_rate` and `event_period` in perf_event_attr? `sample_rate` seems to describe a sampling frequency, but `event_period` is based on the number of event occurrences. – Kaniel Venson Jul 06 '17 at 02:58
  • I do not think they are directly related; however, if CPU throttling happens, the parameter perf_event_max_sample_rate will be reduced. If it is reduced, the number of samples collected will drop. So even though you are trying to collect all events, you cannot collect them. Try changing this parameter and see whether the number of recorded events changes. – Arnabjyoti Kalita Jul 06 '17 at 03:35
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/148538/discussion-between-arnabjyoti-kalita-and-kaniel-venson). – Arnabjyoti Kalita Jul 06 '17 at 18:27
  • Sure, but I still don't understand the actual meaning of the -c parameter and perf_max_cpu_percent; I'm trying to find more information about them. – Kaniel Venson Jul 07 '17 at 01:47