Reading performance counters at this rate is a stretch in terms of overhead. That is exactly why perf stat has a lower limit of 10 ms periods. It runs a userspace task that reads the counters at those intervals.
On the other hand, perf record will set up the perf events such that they are recorded by the kernel itself when the counter overflows. The advantage is lower overhead, but the event is not necessarily recorded at regular time intervals. If you run perf record --freq 1000, the kernel will adapt the overflow period of the counter to try to achieve the requested 1 millisecond intervals. The resulting time intervals will not be constant unless your event rate is really stable. If your event rate varies greatly, so will the time intervals.
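To illustrate why the intervals drift, here is a toy model of that adaptation. It is not the kernel's actual algorithm, only a sketch under the assumption that the overflow period is retargeted after every sample from the event rate just observed; the function name simulate_intervals and the example rates are made up for illustration.

```python
def simulate_intervals(event_rates, freq_hz):
    """Toy model: return the time between consecutive samples in seconds.

    event_rates: events per second while waiting for each successive sample.
    freq_hz: requested sampling frequency (what --freq asks for).
    """
    intervals = []
    # Assumption: the period starts out matched to the first rate.
    period = event_rates[0] / freq_hz
    for rate in event_rates:
        # Time needed to count `period` events at the current rate.
        intervals.append(period / rate)
        # Retarget the period using the rate that was just observed.
        period = rate / freq_hz
    return intervals


if __name__ == "__main__":
    # The event rate doubles after the first sample, then stays there.
    for seconds in simulate_intervals([1e9, 2e9, 2e9], freq_hz=1000):
        print(f"{seconds * 1e3:.3f} ms")
```

In this model a stable rate gives constant 1 ms intervals, while every change in the rate produces at least one interval that misses the target before the period catches up.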
Note that the kernel has a mechanism that tries to prevent perf from causing too much overhead: it throttles sampling and lowers kernel.perf_event_max_sample_rate when handling samples takes too long. At your requested rate you will probably hit it.
Also, you should not set up recording for an excessive number of PIDs. Instead, set up a system-wide recording, e.g.:
perf record --all-cpus --timestamp --freq 1000
You get one result file that you can then process per PID with perf script. In addition to the text output, perf script lets you process the events in Python or Perl (see man perf-script-python and man perf-script-perl).
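As a rough sketch of the Python route, the handler below groups samples by PID and reports how evenly spaced they were. It assumes the recorded events are counter samples that perf script delivers to the generic process_event handler, and that each sample dictionary carries "pid" and "time" (in nanoseconds) as described in man perf-script-python. The file name count_by_pid.py and the helper mean_gap_ms are made up for illustration.

```python
# count_by_pid.py (hypothetical name); run with: perf script -s count_by_pid.py
from collections import defaultdict

# pid -> list of sample timestamps in nanoseconds
samples_by_pid = defaultdict(list)


def mean_gap_ms(times):
    """Mean distance between consecutive timestamps, in milliseconds."""
    ordered = sorted(times)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    if not gaps:
        return float("nan")
    return sum(gaps) / len(gaps) / 1e6


def process_event(param_dict):
    """Called by perf script once per sample (generic events)."""
    sample = param_dict["sample"]
    samples_by_pid[sample["pid"]].append(sample["time"])


def trace_end():
    """Called by perf script after the last event has been processed."""
    for pid, times in sorted(samples_by_pid.items()):
        print(f"pid {pid}: {len(times)} samples, "
              f"mean gap {mean_gap_ms(times):.3f} ms")
```

With a system-wide recording, the samples of one PID can come from several CPUs, so the gaps here are per process, not per counter.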