1

I want to measure the efficiency of an algorithm. The algorithm is either implemented by me in source code or provided by a library. The main function runs the algorithm by calling either the function I implemented or the interface of the shared library. I want to get an accurate execution time and the number of CPU cycles consumed by the algorithm.

Linux + GCC + C language

Thanks

river

3 Answers

3

Consider the code from SO: Get CPU cycle count?

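#include <stdint.h>

static inline uint64_t get_cycles(void)
{
  uint32_t lo, hi;
  /* "=A" only means edx:eax on 32-bit x86; on x86-64 it drops the upper
     half, so read both registers explicitly. */
  __asm__ volatile ("rdtsc" : "=a"(lo), "=d"(hi));
  return ((uint64_t)hi << 32) | lo;
}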

Then, if C++ is an option, implement a class like the following:

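#include <cstdint>
#include <iostream>

class ScopedTimer
{
  public:
    ScopedTimer ()
    {
      // remember the counter value at construction
      m_start = get_cycles ();
    }

    ~ScopedTimer ()
    {
      // report the elapsed count when the object goes out of scope
      auto diff = get_cycles () - m_start;
      std::cout << "Takes " << diff << " cycles" << std::endl;
    }

  private:
    uint64_t m_start;
};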

Finally you can simply use that class in your code with:

void job () {
  ScopedTimer timer;
  // do some job
  // leaving the scope will automatically print the message in the destructor.
}

I have some similar code that automatically collects statistics in different categories. The main difference is that the destructor has to accumulate the cycles into a statistics class or something similar.
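Since the question asks for C, the same accumulate-instead-of-print idea can be sketched without a class. This is only a rough sketch under a few assumptions: tick_stats, timed_call, report and sample_work are hypothetical names, and __rdtsc() from <x86intrin.h> is the GCC/Clang intrinsic for rdtsc, so the code builds on x86 only. The value it returns is a count of time stamp counter ticks, which on recent CPUs need not equal core clock cycles.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h> /* __rdtsc(); GCC/Clang on x86 only */

/* Hypothetical accumulator; the name and fields are illustrative. */
typedef struct {
    const char *label;
    uint64_t total_ticks;
    uint64_t calls;
} tick_stats;

/* Run fn(arg) once and add the elapsed TSC ticks to *stats. */
static void timed_call(tick_stats *stats, void (*fn)(void *), void *arg)
{
    assert(stats != NULL && fn != NULL);
    uint64_t start = __rdtsc();
    fn(arg);
    stats->total_ticks += __rdtsc() - start;
    stats->calls++;
}

/* Print the accumulated totals for one category. */
static void report(const tick_stats *stats)
{
    printf("%s: %" PRIu64 " calls, %" PRIu64 " ticks\n",
           stats->label, stats->calls, stats->total_ticks);
}

/* Placeholder workload standing in for the algorithm under test. */
static void sample_work(void *arg)
{
    volatile uint64_t sink = 0;
    for (int i = 0; i < *(int *)arg; i++)
        sink += (uint64_t)i;
}
```

A caller would keep one tick_stats per category, wrap each run of the algorithm in timed_call(&stats, sample_work, &n), and call report(&stats) once at the end.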

sfrehse
0

You can instrument your code by calling functions that return time stamp counters and adding logic to calculate how much time your code needed.
This obviously changes the code, and you have to make the changes yourself, so it takes work.
You can also time the whole program, but that doesn't give you fine-grained data on how the time is spent.
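As an illustration of that kind of instrumentation, here is a minimal sketch that brackets a single call with POSIX clock_gettime. It assumes Linux with glibc; now_ns, elapsed, time_call and sample_work are hypothetical names, and sample_work is just a placeholder for the algorithm being measured. CLOCK_MONOTONIC gives wall-clock time, while CLOCK_PROCESS_CPUTIME_ID gives CPU time consumed by the process.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Current value of the given clock in nanoseconds. */
static int64_t now_ns(clockid_t clk)
{
    struct timespec ts;
    int rc = clock_gettime(clk, &ts);
    assert(rc == 0);
    (void)rc;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical result type: wall-clock and CPU time of one call. */
typedef struct {
    int64_t wall_ns;
    int64_t cpu_ns;
} elapsed;

/* Run fn(arg) once and return how long it took on both clocks. */
static elapsed time_call(void (*fn)(void *), void *arg)
{
    int64_t w0 = now_ns(CLOCK_MONOTONIC);
    int64_t c0 = now_ns(CLOCK_PROCESS_CPUTIME_ID);
    fn(arg);
    int64_t c1 = now_ns(CLOCK_PROCESS_CPUTIME_ID);
    int64_t w1 = now_ns(CLOCK_MONOTONIC);
    return (elapsed){ w1 - w0, c1 - c0 };
}

/* Placeholder workload standing in for the algorithm under test. */
static void sample_work(void *arg)
{
    volatile uint64_t sink = 0;
    for (int i = 0; i < *(int *)arg; i++)
        sink += (uint64_t)i;
}
```

For short functions it usually helps to call the function many times inside the timed region and divide, since a single measurement is dominated by timer overhead and noise.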

The best way to measure this sort of thing is to run the program in a profiler. If you're on a new-ish Linux, you have perf available to you.
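The counters that perf reads can also be queried from inside a C program through the perf_event_open system call. The following is a rough sketch, not a definitive implementation: open_cycle_counter, count_cycles and sample_work are hypothetical names, and it assumes Linux kernel headers are installed. The call can fail in virtual machines, containers, or when /proc/sys/kernel/perf_event_paranoid is restrictive, in which case count_cycles returns -1.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a user-space CPU cycle counter for the calling thread.
   Returns a file descriptor, or -1 if the kernel refuses. */
static int open_cycle_counter(void)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof attr;
    attr.config = PERF_COUNT_HW_CPU_CYCLES;
    attr.disabled = 1;       /* start stopped, enable explicitly */
    attr.exclude_kernel = 1; /* count user-space cycles only */
    attr.exclude_hv = 1;
    /* pid = 0, cpu = -1: this thread, on whichever CPU it runs */
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
}

/* Count cycles spent in one call to fn(arg).
   Returns 0 and fills *cycles on success, -1 on failure. */
static int count_cycles(void (*fn)(void *), void *arg, uint64_t *cycles)
{
    assert(fn != NULL && cycles != NULL);
    int fd = open_cycle_counter();
    if (fd < 0)
        return -1;
    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    fn(arg);
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
    ssize_t n = read(fd, cycles, sizeof *cycles);
    close(fd);
    return n == (ssize_t)sizeof *cycles ? 0 : -1;
}

/* Placeholder workload standing in for the algorithm under test. */
static void sample_work(void *arg)
{
    volatile uint64_t sink = 0;
    for (int i = 0; i < *(int *)arg; i++)
        sink += (uint64_t)i;
}
```

Unlike a raw time stamp counter read, this counter follows the thread across CPUs and excludes time when the thread is not running, which is closer to "cycles consumed by the algorithm".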

EOF
  • `perf` is a profiler that's integrated in the Linux kernel. It gives a lot of useful info. I believe there's a tag for it on stackoverflow. – EOF Nov 20 '14 at 13:38
  • "run the program in a profiler": will gprof help? Or valgrind? Or should I use " __asm volatile ("rdtsc" : "=A"(t)); " from the first answer? Thanks – river Nov 20 '14 at 13:50
  • @river: I've no experience with `gprof`. `valgrind` is effectively emulation, to catch bugs in the program. It is *much* slower than native execution, so I wouldn't recommend using it to profile. – EOF Nov 20 '14 at 13:54
0

Note that rdtsc is architecture dependent: it is an x86 instruction. The Linux kernel's get_cycles(), however, is an architecture-independent interface.
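In user space, one way to cope with that architecture dependence is a small wrapper that selects the counter at compile time. The sketch below is an assumption-laden illustration: read_ticks is a hypothetical name and is not the kernel's get_cycles(). It uses the __rdtsc() intrinsic on x86, the cntvct_el0 virtual counter on AArch64, and falls back to clock_gettime elsewhere. The unit of the returned value differs per branch (TSC ticks, generic timer ticks, or nanoseconds), so values are only comparable on the same machine.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h> /* __rdtsc() */
#endif

/* Hypothetical user-space wrapper; not the kernel's get_cycles(). */
static inline uint64_t read_ticks(void)
{
#if defined(__x86_64__) || defined(__i386__)
    return __rdtsc(); /* x86 time stamp counter */
#elif defined(__aarch64__)
    uint64_t v;
    /* ARMv8 virtual counter, readable from user space on Linux */
    __asm__ volatile ("mrs %0, cntvct_el0" : "=r"(v));
    return v;
#else
    /* Fallback: monotonic nanoseconds rather than a cycle count */
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);
    assert(rc == 0);
    (void)rc;
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
#endif
}
```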

ftrace is an interesting utility for measuring latency and it can be very helpful for performance optimization.

Karthik Balaguru