As we all know, perf is the tool for reading a program's CPU performance counters, such as cache misses, cache references, and instructions executed.

Question: how can I get those performance counters for just one piece of code (such as a function) within a C or C++ program?
For example, my program first does some initialization, then does the work, then finalizes. I only want the performance counters for the work, such as the function do_something_1.

int main(int argc, char ** argv) {
    do_initialize();
    for (int i = 0;i < 100 ;i ++) {
        /* begin profile code */
        do_something_1();
        /* end profile code */
        do_something_2();
    } 
    do_finalize();
}
pgplus1628
  • @displayName Yes. AFAIK, some Intel CPUs have a PMU (performance monitoring unit) that can measure those events, so I could measure them for a piece of code by resetting and reading the PMU counters. – pgplus1628 Oct 19 '15 at 06:03
  • Simply using the Linux API is enough; see https://stackoverflow.com/questions/42088515/perf-event-open-how-to-monitoring-multiple-events for an example. – TingQian LI Feb 18 '22 at 08:36

5 Answers

Finally, I found a library that reads those counters for a piece of code:

PAPI

For example, to measure L3 data cache reads for a piece of code:

#include "papi.h"
#include <iostream>
#include <glog/logging.h>

#define ASIZE 2684354560ULL // number of ints; the array needs 10 GiB of memory

#define event_count (1) // the number of events you want to trace

int main(int argc, char ** argv) {

  int events[event_count] = {PAPI_L3_DCR}; // L3 Data Cache Read
  int ret;
  long long int values[event_count]; // result  

  int* array = new int [ASIZE ];

  /* start counters (classic high-level API, which PAPI 6.0 removed) */
  ret = PAPI_start_counters(events, event_count);
  CHECK_EQ(ret, PAPI_OK);

  size_t tot_cnt = 1;
  for(size_t cnt = 0; cnt < tot_cnt; cnt ++) {
    for(size_t i = 0;i < ASIZE ;i ++) {
      array[i] = i;
    }
  }

  /* read counters */
  ret = PAPI_read_counters(values, event_count);
  CHECK_EQ(ret, PAPI_OK);

  for(size_t i = 0;i < event_count ;i ++) {
    LOG(INFO) << " " << values[i];
  }
  return 0;
}

Makefile :

CXX?=g++
INC?=-I<path to where papi is installed>/include/
LIB?=-L<path to where papi is installed>/lib/ -lpapi -lglog

main : main.cpp
	${CXX} -O3 ${INC} -o $@ $< ${LIB}

all : main

.PHONY: all clean
clean :
	rm -f main
pgplus1628

You can use operf (oprofile).

In short:

# Build your program with debugging information
# Start up the profiler
operf /path/to/mybinary
# generate a profile summary
opreport  --symbols
# produce some annotated source
opannotate --source --output-dir=/path/to/annotated-source

Example annotated output:

$ opannotate --source --output-dir=/home/moz/src/annotated `which oprofiled`
$ vi /home/moz/src/annotated/home/moz/src/oprofile/daemon/opd_image.c # the annotated source output
...
               :static uint64_t pop_buffer_value(struct transient * trans)
   254  2.4909 :{ /* pop_buffer_value total:   2105 20.6433 */
               :        uint64_t val;
               :
   160  1.5691 :        if (!trans->remaining) {
               :                fprintf(stderr, "BUG: popping empty buffer    !\n");
               :                exit(EXIT_FAILURE);
               :        }
               :
               :        val = get_buffer_value(trans->buffer, 0);
   123  1.2062 :        trans->remaining--;
    65  0.6374 :        trans->buffer += kernel_pointer_size;
               :        return val;
   230  2.2556 :}

doqtor

I surveyed ways to solve the same problem in my project and found another framework called SkyPat (https://skypat.skymizer.com), which, like PAPI, can read the PMU counters for a piece of code.

I have tried both PAPI and SkyPat to get the PMU counters for a function. I think the difference between them is that SkyPat combines unit tests with perf_event. It borrows the concepts of Google Test and provides an interface to the PMU, so it is easy to integrate with Google Test.

For example, to measure cache references and cache misses for a function:

#include <unistd.h>
#include "pat/pat.h"
#include "test.h"

PAT_F(MyCase, my_test)
{
  int result = 0;

  COUNT(pat::CONTEXT_SWITCHES) {
    test(10);
  }
  COUNT(pat::CPU_CLOCK) {
    test(10);
  }
  COUNT(pat::TASK_CLOCK) {
    test(10);
  }
  COUNT(pat::CACHE_REFERENCES) {
    test(10);
  }
  COUNT(pat::CACHE_MISSES) {
    test(10);
  }
}

int main(int argc, char* argv[])
{
  pat::Test::Initialize(&argc, argv);
  pat::Test::RunAll();
}

The resulting SkyPat log:

[    pat   ] Running 1 tests from 1 cases.
[----------] 1 test from MyCase.
[ RUN      ] MyCase.my_test
[ TIME (ns)]         2537         1000          843         1855         1293
[EVENT TYPE] [CTX SWITCH] [CPU  CLOCK] [TASK CLOCK] [CACHE  REF] [CACHE MISS]
[RESULT NUM]            0          982          818            2            0
[==========] 1 test from 1 cases ran.
[  PASSED  ] 1 test.
Pisco

It sounds like you are looking for profiling.

Since you say you are on Linux, have a look at the gprof toolchain. You simply compile your program with a few extra compiler options and run it. gprof then inspects the generated profiling data and reports results for each code block.

First: compile your program with the additional options:

g++ <source> -c -g -pg
...

Second: link; these options are needed here too!

g++ <object1> <object2> ... <objectn> -g -pg -o <target>

Third: run your program:

./<target>

After that, get the statistics:

gprof <target>
Klaus
  • Thanks for your reply. My question is how to get performance counters for **a piece of code**, not the whole program. – pgplus1628 Jun 08 '15 at 13:09
  • If you compile only a single module with `-pg` and link with `-pg`, the overall performance figures will be wrong, but the timing values for the methods/functions in that module will be correct. You are free to split your functions/methods into different files. – Klaus Jun 08 '15 at 13:17
  • Since my project is a large one, extracting the function would take a lot of effort. Anyway, thank you for your reply~ – pgplus1628 Oct 22 '15 at 04:45

I am facing the same situation and did some research on it. Here is what I learned. First, perf is part of the kernel, and you can find its headers at:

/usr/src/kernels/$VERSION/include/linux/perf_regs.h
/usr/src/kernels/$VERSION/include/linux/perf_event.h
/usr/src/kernels/$VERSION/include/uapi/linux/perf_event.h

I think the core file is perf_event.h. You could also check its GitHub site, which has some explanation of how to use it, but it is not very clear and I still have many open questions.

In addition, I found a very useful helper library called pfmlib for programming perf events. It ships examples and perf_examples that show how to do this at the code level. I am still working on it. I hope this helps; if you have questions, we can learn from each other.

The website of pfmlib is http://perfmon2.sourceforge.net.

L.Y.
  • Thank you for your answer. I found a way to solve my problem. – pgplus1628 Oct 19 '15 at 06:25
  • Yeah. I also found a way to solve this problem last night, using the library for the perf part of the Linux kernel. Thanks for your comment! – L.Y. Oct 19 '15 at 15:07
  • Is the "library of perf part in linux kernel" the "pfmlib" you mentioned above? – pgplus1628 Oct 20 '15 at 03:10
  • Not exactly. libpfm is libpfm4, a helper library for the perf part of the Linux kernel. You could say they are the same to some degree, but libperf4 offers an easier way to use perf (the kernel part) to test our code. That is my current understanding. I actually solved my problem using libpfm. You could visit this website, which uses perf on Linux to measure a printf call: http://web.eece.maine.edu/~vweaver/projects/perf_events/perf_event_open.html – L.Y. Oct 20 '15 at 03:58
  • You are welcome : ) By the way, sorry for my typo: libpfm4, libperf4, and libpfm are all the same thing, which is libpfm4. – L.Y. Oct 21 '15 at 23:06