92

So far, I've only used Rational Quantify. I've heard great things about Intel's VTune, but have never tried it!

Edit: I'm mostly looking for software that will instrument the code, as I guess that's about the only way to get very fine results.


See also:

What are some good profilers for native C++ on Windows?

Community
OysterD
  • Do you want 1) to measure, or do you want 2) to find speedups? If you want 2, and you think that requires 1, that's not so. To find speedups, you do not need "very fine results". If the program is spending 90% of its time doing something you could very well remove if you knew what it was, [*stack samples*](http://stackoverflow.com/a/378024/23771) will show it to you 9 times out of 10. If you look at 10 samples, do you care if you see it 10 times, 9 times, or 8 times? Either way, you *know what it is*. The measured percent does not matter. – Mike Dunlavey Oct 09 '16 at 16:18
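
The arithmetic behind the comment above can be checked with a short sketch. It assumes each stack sample is an independent draw that lands in the costly activity with probability p (a simplifying assumption, not something the comment states), and the function name `probAtLeast` is made up for illustration.

```cpp
#include <cmath>

// Probability that an activity occupying fraction p of the run time
// shows up on at least k of n independent stack samples (binomial tail).
double probAtLeast(int n, int k, double p) {
    double total = 0.0;
    for (int i = k; i <= n; ++i) {
        // Binomial coefficient C(n, i), built up incrementally
        // to avoid computing large factorials.
        double comb = 1.0;
        for (int j = 1; j <= i; ++j)
            comb = comb * (n - i + j) / j;
        total += comb * std::pow(p, i) * std::pow(1.0 - p, n - i);
    }
    return total;
}
```

Under that assumption, `probAtLeast(10, 8, 0.9)` comes out at roughly 0.93, so ten samples are very likely to show a 90% hotspot at least eight times.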

21 Answers

38

For Linux development (although some of these tools might work on other platforms). These are the two big names I know of; there are plenty of smaller ones that haven't seen active development in a while.

Al.
29

For Linux: Google Perftools

  • Faster than Valgrind (though not as fine-grained)
  • Does not need code instrumentation
  • Nice graphical output (--> kcachegrind)
  • Does memory-profiling, cpu-profiling, leak-checking
abcd
Weidenrinde
10

IMHO, sampling using a debugger is the best method. All you need is an IDE or debugger that lets you halt the program. It nails your performance problems before you even get the profiler installed.

Hugh Perkins
Mike Dunlavey
  • Yes! This works great for me. It doesn't need instrumentation. It doesn't need any profiler etc. installed. On Linux, you can use gdb. The program runs at full speed. Hit Ctrl-C to halt. Type 'bt' to show the stack trace. Then 'c' to continue, then Ctrl-C again. Works great! Just reduced my execution time by 20%, in a complex program, using this technique. Awesome! – Hugh Perkins Dec 02 '14 at 14:19
  • @HughPerkins: Thanks for your edit, and I'm glad you're succeeding. (I bet you can do better than 20% :) – Mike Dunlavey Dec 02 '14 at 14:42
  • Yes, I got iteration time down from 1200ms to 200ms, in a few hours' work, using only gdb + Ctrl-C to locate the hotspots :-) – Hugh Perkins Dec 07 '14 at 08:20
  • @HughPerkins: For me, if I'm working on my own code, it's tough to know when to stop trying - it seems like I can always squeeze it some more. When I'm working on somebody else's code, there can be a problem. I can't always convince the "owner" of the code to fix the problem, so the process stalls. It's an interesting conundrum. – Mike Dunlavey Dec 07 '14 at 15:54
  • If you just want to achieve this without instrumentation, you do not even need a debugger or IDE on Linux. Just run "pstack " to get a stack trace of the currently running instruction. It is much simpler than launching a debugger, breaking manually, and then looking at the stack trace. – Manish Sogi May 06 '20 at 04:10
  • @ManishSogi: You're right. Sadly I haven't been on Linux for ages. And actually I like to look at more than just the stack trace, like what are the variables, or step out of whatever routine I'm in, so I can get a better idea what it's doing and why. – Mike Dunlavey May 07 '20 at 01:22
7

My only experience profiling C++ code is with AQTime by AutomatedQA (now SmartBear Software). It has several types of profilers built in (performance, memory, Windows handles, exception tracing, static analysis, etc.), and instruments the code to get the results.

I enjoyed using it - it was always fun to find those spots where a small change in code could make a dramatic improvement in performance.

Matt Dillard
6

I have never done profiling before. Yesterday I programmed a ProfilingTimer class with a static timetable (a map<std::string, long long>) for time storage.

The constructor stores the starting tick, and the destructor calculates the elapsed time and adds it to the map:

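// Constructor: push this timer's name onto the nested-name path
// and record the starting tick.
ProfilingTimer::ProfilingTimer(std::string name)
    : mLocalName(name)
{
    sNestedName += mLocalName;
    sNestedName += " > ";

    // Make sure the timetable has an entry for this nested path.
    if (sTimetable.find(sNestedName) == sTimetable.end())
        sTimetable[sNestedName] = 0;

    mStartTick = Platform::GetTimerTicks();
}

// Destructor: add the elapsed ticks to the timetable and pop this
// timer's name off the nested-name path.
ProfilingTimer::~ProfilingTimer()
{
    long long totalTicks = Platform::GetTimerTicks() - mStartTick;

    sTimetable[sNestedName] += totalTicks;

    // Remove the local name plus the 3-character " > " separator.
    sNestedName.erase(sNestedName.length() - mLocalName.length() - 3);
}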

In every function (or {block}) that I want to profile, I need to add:

ProfilingTimer profilingTimer("identifier");

This line is a bit cumbersome to add to every function I want to profile, since I have to guess which functions take a lot of time. But it works well, and the print function shows the time consumed as a percentage.

(Is anyone else working with any similar "home-made profiling"? Or is it just stupid? But it's fun! Does anyone have improvement suggestions?

Is there some way to automatically add a line to all functions?)
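
For readers who want to try the idea, here is a self-contained sketch of the same scope-timer pattern. It is not the author's class: `ScopedTimer` and `PROFILE_FUNCTION` are hypothetical names, `std::chrono::steady_clock` stands in for `Platform::GetTimerTicks()`, and the shared state is not thread-safe.

```cpp
#include <chrono>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for the ProfilingTimer described above.
// Shared static state means this sketch is for single-threaded use only.
class ScopedTimer {
public:
    using Clock = std::chrono::steady_clock;

    explicit ScopedTimer(const std::string& name)
        : mPreviousLength(nestedName().size()), mStart(Clock::now()) {
        // Push this scope onto the nested path, e.g. "update > physics > ".
        nestedName() += name;
        nestedName() += " > ";
    }

    ~ScopedTimer() {
        const auto elapsed = Clock::now() - mStart;
        timetable()[nestedName()] +=
            std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count();
        nestedName().resize(mPreviousLength);  // pop this scope's name
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

    // Accumulated nanoseconds per nested scope path.
    static std::map<std::string, long long>& timetable() {
        static std::map<std::string, long long> table;
        return table;
    }

private:
    static std::string& nestedName() {
        static std::string name;
        return name;
    }

    std::size_t mPreviousLength;  // path length before this scope was pushed
    Clock::time_point mStart;
};

// Names the timer after the enclosing function, so the identifier
// does not have to be typed by hand.
#define PROFILE_FUNCTION() ScopedTimer profileFunctionTimer(__func__)
```

Putting `PROFILE_FUNCTION();` at the top of a function body times that function until it returns. That still means adding one line per function. For fully automatic insertion, GCC's `-finstrument-functions` option makes the compiler emit enter and exit hooks for every function, though whether that fits a given build is a separate question.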

Moberg
5

The profiler in Visual Studio 2008 is very good: fast, user friendly, clear and well integrated in the IDE.

Dimitri C.
  • Isn't the profiler in the Team version only? – dwj Feb 17 '10 at 19:39
  • @dwj: I'm not sure. I am using Visual Studio Team System 2008 Development Edition. – Dimitri C. Feb 18 '10 at 07:18
  • Looks like it is only in the Team edition (http://stackoverflow.com/questions/61669/profiling-in-visual-studio-2008-pro/61681#61681) for versions before 2010. – dwj Feb 19 '10 at 18:09
5

I've used Glowcode extensively in the past and have had nothing but positive experiences with it. Its Visual Studio integration is really nice, and it is the most efficient/intuitive profiler that I've ever used (even compared to profilers for managed code).

Obviously, that's useless if you're not running on Windows, but the question leaves it unclear to me exactly what your requirements are.

jsight
5

oprofile, without a doubt; it's simple, reliable, does the job, and can give all sorts of nice breakdowns of data.

Dark Shikari
4

For Windows, check out Xperf. It uses sampled profiling, has a useful UI, and does not require instrumentation. It is quite useful for tracking down performance problems. You can answer questions like:

  • Who is using the most CPU? Drill down to function name using call stacks.
  • Who is allocating the most memory?
  • Who is doing the most registry queries?
  • Disk writes? etc.

You will be quite surprised when you find the bottlenecks, as they are probably not where you expected!

user15071
4

Since you don't mention the platform you're working on, I'll say cachegrind under Linux. Definitely. It's part of the Valgrind toolset.

http://valgrind.org/info/tools.html

I've never used its sub-feature Callgrind, since most of my code optimization happens inside functions.

Note that there is a frontend KCachegrind available.

rlerallut
4

For Windows, I've tried AMD CodeAnalyst, Intel VTune and the profiler in Visual Studio Team Edition.

CodeAnalyst is buggy (crashes frequently) and on my code, its results are often inaccurate. Its UI is unintuitive. For example, to reach the call stack display in the profile results, you have to click the "Processes" tab, then click the EXE filename of your program, then click a toolbar button with the tiny letters "CSS" on it. But it is freeware, so you may as well try it, and it works (with fewer features) without an AMD processor.

VTune ($700) has a terrible user interface IMO; in a large program, it's hard to find the particular call tree you want, and you can only look at one "node" in a program at a time (a function with its immediate callers and callees)--you cannot look at a complete call tree. There is a call graph view, but I couldn't find a way to make the relative execution times appear on the graph. In other words, the functions in the graph look the same regardless of how much time was spent in them--it's as though they totally missed the point of profiling.

Visual Studio's profiler has the best GUI of the three, but for some reason it is unable to collect samples from the majority of my code (samples are only collected for a few functions in my entire C++ program). Also, I couldn't find a price or way to buy it directly; but it comes with my company's MSDN subscription. Visual Studio supports managed, native, and mixed code; I'm not sure about the other two profilers in that regard.

In conclusion, I don't know of a good profiler yet! I'll be sure to check out the other suggestions here.

Qwertie
3

For Windows development, I've been using Software Verification's Performance Validator - it's fast, reasonably accurate, and reasonably priced. Best yet, it can instrument a running process, and lets you turn data collection on and off at runtime, both manually and based on the callstack - great for profiling a small section of a larger program.

Shog9
3

I use DevPartner for the PC platform.

EvilTeach
  • It does instrument the code. It has code coverage and bounds checking (instrumented and uninstrumented). – EvilTeach Nov 05 '08 at 00:49
3

There are different requirements for profiling. Is instrumented code OK, or do you need to profile optimized code (or even already compiled code)? Do you need line-by-line profile information? Which OS are you running? Do you need to profile shared libraries as well? What about tracing into system calls?

Personally, I use oprofile for everything I do, but that might not be the best choice in every case. VTune and Shark are both excellent as well.

Louis Brandy
2

The only sensible answer is PTU from Intel. Of course, it's best to use it on an Intel processor and, to get even more valuable results, on at least a C2D machine, as that architecture makes it easier to get meaningful profiles.

Fabien Hure
2

I've used VTune under Windows and Linux for many years with very good results. Later versions have gotten worse: when they outsourced that product to their Russian development crew, quality and performance both went down (increased VTune crashes, often 15+ minutes to open an analysis file).

Regarding instrumentation, you may find that it's less useful than you think. In the kind of applications I've worked on, adding instrumentation often slows the product down so much that it doesn't work anymore (true story: start app, go home, come back next day, app still initializing). Also, with non-instrumented profiling you can react to live problems. For example, with the VTune remote data collector I can start up a sampling session against a live server with hundreds of simultaneous connections that is experiencing performance problems, and catch issues that happen in production that I'd never be able to replicate in a test environment.

Don Neufeld
2

ElectricFence works nicely for malloc debugging

Michael McCarty
2

I have tried Quantify and AQTime, and Quantify won because of its invaluable 'focus on sub tree' and 'delete sub tree' features.

eli
  • Full ack. I just had to do some profiling on a C++ application, and those were the exact features that really made my day. – Enno Mar 07 '09 at 21:15
1

My favorite tool is Easy Profiler: http://code.google.com/p/easyprofiler/

It's a compile-time profiler: the source code must be manually instrumented, using a set of routines, to describe the target regions. However, once the application has run and the measurements have been automatically written to an XML file, it is only a matter of opening the Observer application and making a few clicks in the analysis/compare tools before you can see the result in a qualitative chart.

charfeddine.ahmed
1

The Visual Studio 2010 profiler under Windows. VTune had a great call graph tool, but it broke as of Windows Vista/7. I don't know if they fixed it.

Coder
0

Let me give a plug for EQATEC... just what I was looking for... simple to learn and use and gives me the info I need to find the hotspots quickly. I much prefer it to the one built in to Visual Studio (though I haven't tried the VS 2010 one yet, to be fair).

The ability to take snapshots is HUGE. I often get an extra analysis and optimization done while waiting for the real target analysis to run... love it.

Oh, and its base version is free!
http://www.eqatec.com/Profiler/

Brian Kennedy