Collecting the data for a particular process from the PMU every 1 millisecond

Question

I would like to read the hardware performance counters for a particular PID every 1 millisecond and save the output to a text file.

The code below collects the data of all the processes running in the system in parallel for a certain duration and then outputs it to a text file.

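    #!/bin/sh
    #set -x
    # List every PID (second column of `ps -ef`), then drop the header line.
    ps -ef | awk '{ print $2 }' > out.txt
    sed '1d' out.txt > tmp
    # Attach one background `perf stat` to each PID for 5 seconds,
    # writing its report to results-<pid> through file descriptor 3.
    while read -r pid
    do
        3>"results-$pid" perf stat -p "$pid" --log-fd 3 sleep 5 > /dev/null &
    done < tmp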

How should the loop be written in order to collect the stats for a process every 1 millisecond?

Apart from perf, is there any other way to monitor the details every 1 millisecond? — SRT, Apr 10 '18 at 21:27
Use `perf stat --interval-print msecs -p $tmp` so you're not trying to fork+exec a new `perf` process every millisecond. The manual says the minimum interval for that is 10ms, but you could maybe build a custom version. The default timeslice is normally 10ms, so you might need a kernel configured differently, maybe with HZ=1000 if Linux still uses HZ. (I haven't paid attention to scheduler tick resolution vs. NO_HZ tickless kernels recently.) — Peter Cordes, Apr 10 '18 at 23:42
`perf record --timestamp` will record timestamps on events. The manual says "you can use `perf report -D` to see the timestamps". I think something like `perf record` might be your best bet for recording things, and then you can process that data later. If you need something to happen every 1ms, you definitely want to avoid running a shell loop while recording data; that's a lot of system load. Use `perf`'s system-wide mode to have one instance of `perf` collect data from everything. — Peter Cordes, Apr 10 '18 at 23:46
So to monitor something at 1 millisecond granularity, a custom script has to be written, since perf cannot do it. — SRT, Apr 11 '18 at 00:05
No, you want a customized version of `perf`. A bash script would be too high overhead to do something every 1 ms, especially for every process in the system. — Peter Cordes, Apr 11 '18 at 00:24
`perf record --all-cpus --timestamp` can efficiently collect the raw data you need; then the trick is to process it into what you want, which I guess is some kind of per-process report for each ms. Try `perf report -D` and see if it does anything close to what you want. Read the manual for `perf`. — Peter Cordes, Apr 11 '18 at 00:25
Do you mean a customized version as in this link: https://stackoverflow.com/questions/42088515/perf-event-open-how-to-monitoring-multiple-events ? That code has two lines that use the `timespec` struct, whose only members are seconds and nanoseconds: `struct timespec time, time2; time.tv_sec = 1; time.tv_nsec = 0;`. If I give it the value 1000000, i.e. `time.tv_nsec = 1000000;`, would this be a customized version? — SRT, Apr 11 '18 at 00:48
What exactly do you want to know about your system that you can't learn with a normal `perf record --all-cpus --timestamp` / `perf report -D`, or other monitoring tools? — Peter Cordes, Apr 11 '18 at 02:36
The normal perf cannot monitor at intervals below 10 milliseconds. I would like to monitor an application running on my system every 1 millisecond. — SRT, Apr 11 '18 at 15:49
`perf stat` can only log every 10ms. `perf record --timestamp` records all events as they happen, with high-resolution timestamps. The 10ms limit doesn't apply in any way to `perf record`. — Peter Cordes, Apr 11 '18 at 19:59

score 2 · Answer 1 · answered Apr 12 '18 at 11:22

Reading performance counters at this rate is a bit of a stretch in terms of overhead. That is exactly the reason why perf stat has a lower limit of 10 ms periods. It runs a userspace task for reading the counters in those intervals.

On the other hand, perf record will set up the perf events such that they are recorded by the kernel itself on an overflow of the counter. The advantage is that it has less overhead, but the event is not necessarily recorded at regular time intervals. If you set perf record --freq 1000, the kernel will adapt the overflow rate of the counter, trying to achieve the requested 1 millisecond intervals. The resulting time intervals will not be constant unless your event rate is really stable. If your event rate varies greatly, so will the time intervals.

Note that there is a mechanism in the kernel that will try to prevent perf from causing too much overhead. At your requested rate you will probably hit it.
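As far as I know, this limit is visible through the `kernel.perf_event_max_sample_rate` and `kernel.perf_cpu_time_max_percent` sysctls. A minimal sketch for checking them before asking for 1000 Hz; the helper name `show_perf_limits` and its optional directory argument are made up for illustration:

```shell
# Print the kernel's perf sampling limits.
# show_perf_limits is a hypothetical helper; the optional argument lets you
# point it at a directory other than /proc/sys/kernel.
show_perf_limits() {
    dir=${1:-/proc/sys/kernel}
    for name in perf_event_max_sample_rate perf_cpu_time_max_percent; do
        if [ -r "$dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$dir/$name")"
        else
            printf '%s = (not available)\n' "$name"
        fi
    done
}

show_perf_limits
```

If the reported maximum sample rate is below 1000, the requested frequency is above what the kernel currently allows. The kernel can also lower that value on its own when sampling takes too much CPU time.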

Also, you should not set up recordings for an excessive number of PIDs; instead, set up a single system-wide recording, e.g.:

    perf record --all-cpus --timestamp --freq 1000
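Run like this, the recording continues until it is interrupted. A sketch of a bounded variant, assuming it is acceptable to record for the lifetime of a `sleep` (the same trick the question's script uses with `perf stat`). The function name `record_for`, the `PERF` override, the `cycles` event, and the `perf.data` output name are illustrative choices, not part of the original command:

```shell
# Record cycles at roughly 1 kHz for a fixed number of seconds.
# record_for and the PERF override are hypothetical conveniences:
#   record_for 5        -> system-wide, 5 seconds
#   record_for 5 1234   -> only PID 1234, 5 seconds
# Set PERF=echo to print the command instead of running it.
record_for() {
    secs=$1
    pid=${2:-}
    if [ -n "$pid" ]; then
        target="-p $pid"
    else
        target="--all-cpus"
    fi
    # $target is intentionally unquoted so "-p 1234" splits into two words.
    ${PERF:-perf} record -e cycles --freq 1000 --timestamp $target \
        -o perf.data -- sleep "$secs"
}
```

System-wide recording usually needs root or a relaxed `kernel.perf_event_paranoid` setting.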

You get one result file that you can process according to the pid with perf script. In addition to the text output, perf script allows you to process the events in Python or Perl (see man perf-script-python, man perf-script-perl).
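A rough sketch of the text route: it sums the sample periods per PID and per millisecond and writes one `output-<pid>.txt` per PID. It assumes each input line looks like `<pid> <seconds.fraction>: <period> <event>:`, which is my understanding of what `perf script -F pid,time,period,event` prints; check this against your own output first. The function name `summarize_per_ms` and the file naming are made up for illustration:

```shell
# Read perf-script-style lines on stdin and write "<ms> <summed period>"
# lines to <outdir>/output-<pid>.txt, one file per PID.
# ASSUMED column layout: $1 = pid, $2 = "seconds.fraction:", $3 = period.
summarize_per_ms() {
    outdir=${1:-.}
    awk '
        $1 ~ /^[0-9]+$/ {
            t = $2
            sub(/:$/, "", t)                  # drop the trailing colon
            split(t, p, ".")
            ms = p[1] substr(p[2], 1, 3)      # seconds + first 3 decimals, as text
            sum[$1 " " ms] += $3
        }
        END { for (k in sum) printf "%s %.0f\n", k, sum[k] }
    ' |
    sort -k1,1n -k2,2n |
    awk -v dir="$outdir" '
        NR == 1 || $1 != last {               # input is sorted, so each PID opens once
            if (file != "") close(file)
            file = dir "/output-" $1 ".txt"
            last = $1
        }
        { print $2, $3 > file }
    '
}
```

A possible invocation is `perf script -F pid,time,period,event | summarize_per_ms results`, where `results` is an existing directory. Because sampling follows counter overflows, some milliseconds will have no line at all for a given PID.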

answered Apr 12 '18 at 11:22

Zulan

Oh cool, so you can do even better than I was suggesting in comments. Also related: [Maximum sampling frequency supported by perf](//stackoverflow.com/q/49807213) for a link to the internals of the code that limits sampling frequency. – Peter Cordes Apr 13 '18 at 04:56
@PeterCordes BTW with `perf_event_open` you can set up a group of metrics and trigger samples for the group leader (e.g. ref-cycles with a stable rate). Unfortunately that doesn't seem to work with `perf record`, even if you use groups. – Zulan Apr 13 '18 at 07:58
So now, to get the stats for a particular pid, let's say pid 50, does a perf script have to be run so that it outputs the stats into a text file? For example, can we have the text file output for pid 1 as output-1.txt? – SRT Apr 16 '18 at 17:51
I am using the command `perf record -e cycles --all-cpus --freq 1000` to get the CPU cycles for all the CPUs. It takes a long time and outputs nothing. Is there anything wrong with the command or the flags I used? – SRT Apr 17 '18 at 21:54
Did you stop the recording with Ctrl+C once you had covered the time you wanted to monitor? – Zulan Apr 18 '18 at 07:33
