
I'm trying to profile a C++ application. I've tried gprof, HPCToolkit, and Score-P. My problem is that for different runs I obtain different running times; there's a difference of about 10% from one execution to another (e.g., 2.5 vs. 2.7 seconds). Why? I remember that when I was using CrayPat on a Cray system there was no difference at all between executions. Thanks

PS: I'm on Debian 8.

rosilho
  • And those profilers give you the *CPU* time and not the *run* time? Also, most profilers on normal home-computers work by *sampling*, and any sampling method is imprecise, how much depending on sampling rate. – Some programmer dude Sep 16 '15 at 13:00
  • and do you know how I could increase the precision? – rosilho Sep 16 '15 at 13:16
  • @rosilho: It's better if you don't expect precise repeatability of wall-clock elapsed run time, because all but the simplest computers are doing more than one thing at a time. If the reason you are doing this is to find speedups, there's a [*much better method*](http://stackoverflow.com/a/378024/23771). – Mike Dunlavey Sep 16 '15 at 14:08
  • Yes, I know that the kernel stops the execution as it wants, etc. In fact I've tried setting the niceness of the process to the minimum, but even then the running time changes. I was hoping there was a way to give absolute priority to the process so that it can't be stopped. – rosilho Sep 17 '15 at 08:14
  • @rosilho: There's a reason you are profiling. Some people just want to know the numbers, for their own sake. Most are trying to find ways to speed up the code (or "bottlenecks" if you like). To do that, it doesn't matter what the overall time is. It is only necessary to find activities responsible for a *large fraction* of time. Those activities will preferentially appear on stack samples *without needing to be measured*. That's the secret behind [*the method I linked above*](http://stackoverflow.com/a/378024/23771). – Mike Dunlavey Sep 23 '15 at 16:05

1 Answer


Andrei Alexandrescu mentioned in a speech that these days, with modern processors you shouldn't expect repeatability in benchmarks. I think there's two things you can do to make things more predictable. First, run your benchmark for a sufficiently long enough time (I'd advise for something like or close to a minute). And one other thing: make sure power management is off (if you're on an OS and a machine that uses it).