
Debugging performance problems with a standard debugger is almost hopeless, since the level of detail is too high. Another option is a profiler, but profilers seldom give me good information, especially when a GUI and background threads are involved, as I never know whether the user was actually waiting for the computer or not. A different approach is simply pressing Ctrl+C and seeing where the code stops.

What I would really like is Fast Forward, Play, Pause and Rewind functionality combined with some visual representation of the code. This means that I could run the code on Fast Forward until I navigate the GUI to the critical spot. Then I would switch to slow mode, while getting some visual representation of which lines are being executed (possibly some kind of zoomed-out view of the code). I could, for example, set the execution speed to something like 0.0001x. I believe I would get a very good visualization this way of whether the problem is inside a specific module, or maybe in the communication between modules.
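To make the "slow mode" part concrete, here is a crude sketch of what I mean, using Python's `sys.settrace` to pause on every executed line (this is just an illustration of the idea, not an existing tool; `busy` is a stand-in for real application code):

```python
import sys
import time

executed = []  # (filename, lineno) of each executed line

def slow_tracer(frame, event, arg):
    """Trace hook: record and briefly pause on every executed line,
    giving a crude slow-motion view of the control flow."""
    if event == "line":
        executed.append((frame.f_code.co_filename, frame.f_lineno))
        time.sleep(0.001)  # raise this delay for a slower "playback speed"
    return slow_tracer

def busy():
    # stand-in for the code being watched
    total = 0
    for i in range(3):
        total += i
    return total

sys.settrace(slow_tracer)   # "Play" at reduced speed
result = busy()
sys.settrace(None)          # back to full speed
print(result, len(executed))
```

A real tool would render `executed` as a zoomed-out view of the source instead of just collecting line numbers.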

Does this exist? My specific need is in Python, but I would be interested in seeing such functionality in any language.

David
    `rewind` might be difficult after calling `fire_all_employees()` or `system('rm -rf /')`. But I like the general idea... :) – sarnold Mar 24 '11 at 09:23
    It only needed to rewind the visualization of the code execution. I like the idea of automating the firing of employees, as it is a really tedious task. ;) – David Mar 24 '11 at 09:24
    So you want something similar to the [Omniscient Debugger](http://www.lambdacs.com/debugger/), right? [TOD](http://pleiad.dcc.uchile.cl/tod/index.html) is another example. They are both for Java, 'though. – Joachim Sauer Mar 24 '11 at 09:28
  • @Joachim Sauer. They are both interesting, but I believe they are lacking the visualization. I am afraid that I would get lost in the details by going through the code line by line. I will check them out, though. – David Mar 24 '11 at 09:33
  • Have you used Eclipse before? PyDev + Eclipse's debugging perspective have typical debugging features, but you can arrange windows to keep one eye on your app and the other on your code. I can't remember if it has timed stepping or not. – detly Mar 24 '11 at 14:48
  • @detly. I am using PyDev, but it has nothing like such features I described here. – David Mar 24 '11 at 15:17

3 Answers


The "Fast Forward to the critical spot" function already exists in any debugger: it's called a "breakpoint". There are indeed debuggers that can slow down execution, but that will not help you debug performance problems, because they don't slow down the computer. The processor, disk and memory are still exactly as fast as before; all that happens is that the debugger inserts delays between each line of code. That means every line of code suddenly takes more or less the same time, which hides any trace of where the performance problem is.

The only way to find the performance problems is to record every call the application makes and how long it takes. This is what a profiler does. Indeed, using a profiler is tricky, but there probably isn't a better option. In theory you could record every call and its timing, and then play that back and forth with a rewind, but that would use an astonishing amount of memory, and it wouldn't actually tell you anything more than a profiler does (indeed, it would tell you less, as it would miss certain types of performance problems).

You should be able to figure out, with the profiler, what is taking a long time. Note that this can be certain function calls taking a long time because they do a lot of processing, or it can be system calls that take a long time because something (network/disk) is slow. Or it can be a very fast call that is called loads and loads of times. A profiler will help you figure this out. But it helps if you can turn the profiler on just for the critical section (reduces noise) and if you can run that critical section many times (improves accuracy).
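In Python that "profile only the critical section" idea can be sketched with the standard library's `cProfile` (`critical_section` here is a placeholder for the real slow code):

```python
import cProfile
import io
import pstats

def critical_section():
    # stand-in for the slow part of the application
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()           # start recording just before the critical section
for _ in range(10):         # run it several times to improve accuracy
    critical_section()
profiler.disable()          # stop recording to keep surrounding noise out

# print the five most expensive entries by cumulative time
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
report = stream.getvalue()
print(report)
```

Wrapping only the interesting region in `enable()`/`disable()` is what keeps the GUI event loop and background threads from drowning out the result.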

Lennart Regebro
3rd paragraph, sentences 2 & 3, I agree with (not much else:-) The only profilers I think are much good are the ones that 1) sample the call stack (not just the PC), 2) can take samples during I/O as well as computing, 3) report by line (not just function) the percent of samples containing the line (not invocation counts, not time measurements, and especially not "self time"), and 4) let you control when samples are taken. [See this.](http://stackoverflow.com/questions/1777556/alternatives-to-gprof/1779343#1779343) – Mike Dunlavey Mar 24 '11 at 13:57
@Mike: I made absolutely no statement about what kind of profiler to use or its features, except that it should be able to profile only the critical section. So your desire for a particular profiler doesn't contradict anything I said. – Lennart Regebro Mar 24 '11 at 14:54
  • @lennart: Right, you left the type of profiler open. It's just that you said "The only way to find the performance problems is to record every call done in the application and how long it took. This is what a profiler does." You're not alone in thinking this way, and that's what I think gets between people and the solving of their performance problems. – Mike Dunlavey Mar 24 '11 at 15:01
This is somewhat language-dependent. In OO languages such as Java and Python the difference between recording function calls and lines of code is astonishingly small. In C it's obviously different. But it's still unusual to need more granularity than function calls, as it's very unusual for single lines of code to take any significant time. It can be useful if you have long functions with multiple loops, to help you find which loop takes time, but then it can be argued that your functions should be shorter. ;) – Lennart Regebro Mar 24 '11 at 15:07
  • @Lennart: [Here's my favorite example.](http://stackoverflow.com/questions/926266/performance-optimization-strategies-of-last-resort/927773#927773) The only way it could not apply in a language is if the language didn't have function calls. A single line of code, if it contains a function call, or if it doesn't, is significant if it's active (on stack) a good % of time. I would argue that length of functions should be a matter of maintainability, not profile-ability. :) – Mike Dunlavey Mar 24 '11 at 15:27
Lines of code that don't contain calls are not active a good % of the time unless they are in a function that is called often, or in a loop. Hence what I said above applies. Short functions are also more maintainable, so it's a win-win. – Lennart Regebro Mar 24 '11 at 16:50

I assume there is a phase in the app's execution that takes too long - i.e. it makes you wait. I assume what you really want is to see what you could change to make it faster.

A technique that works is random pausing. You run the app under the debugger, pause it during the part of its execution that makes you wait, and examine the call stack. Do this a few times.

Here are some ways your program could be spending more time than necessary.

  • I/O that you didn't know about and didn't really need.
  • Allocating and releasing objects very frequently.
  • Runaway notifications on data structures.
  • others too numerous to mention...

No matter what it is, when it is happening, an examination of the call stack will show it. Once you know what it is, you can find a better way to do it, or maybe not do it at all.

If the program takes 5 seconds when it could take 1 second, then the probability you will see the problem on each pause is 4/5. In fact, any function call you see on more than one stack sample, if you could avoid doing it, will give you a significant speedup. AND, nearly every possible bottleneck can be found this way.

Don't think about function timings or how many times they are called. Look for lines of code that show up often on the stack, that you don't need.
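The pause-and-read-the-stack routine can even be roughly automated in Python with a background thread that samples the main thread's stack (a sketch only; `slow_work` is a hypothetical stand-in for the phase that makes you wait):

```python
import collections
import sys
import threading
import time
import traceback

samples = collections.Counter()  # (file, line, function) -> times seen on stack

def sampler(main_id, interval=0.005, count=60):
    """Periodically grab the main thread's stack and count each frame,
    mimicking a manual 'pause and examine the call stack'."""
    for _ in range(count):
        frame = sys._current_frames().get(main_id)
        if frame is not None:
            for fs in traceback.extract_stack(frame):
                samples[(fs.filename, fs.lineno, fs.name)] += 1
        time.sleep(interval)

def slow_work():
    # stand-in for the slow phase of the app
    total = 0
    for i in range(5_000_000):
        total += i * i
    return total

t = threading.Thread(target=sampler, args=(threading.get_ident(),))
t.start()
slow_work()
t.join()

# lines that appear on many samples are where the time goes
for (fname, lineno, func), n in samples.most_common(3):
    print(func, lineno, n)
```

The point is the same as pausing by hand: you don't need timings, just the lines that keep showing up on the stack.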

Example Added: If you take 5 samples of the stack, and there's a line of code appearing on 2 of them, then it is responsible for about 2/5 = 40% of the time, give or take. You don't know the precise percent, and you don't need to know. (Technically, on average it is (2+1)/(5+2) = 3/7 = 43%. Not bad, and you know exactly where it is.)
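The arithmetic above can be checked directly (a small sketch; nothing here beyond the numbers already in the text):

```python
# If 4 of 5 seconds are spent in the problem, each random pause
# lands in it with probability 4/5.
p_hit = 4 / 5
p_miss_all = (1 - p_hit) ** 3   # chance that three pauses all miss it: ~0.008
print(p_miss_all)

# Rule-of-succession estimate: a line seen on k of n samples is
# responsible for roughly (k + 1) / (n + 2) of the time.
k, n = 2, 5
estimate = (k + 1) / (n + 2)    # 3/7 ~ 0.43
print(round(estimate, 2))
```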

Mike Dunlavey

The methods you're describing, and many of the comments, seem to me to be relatively weak probabilistic attempts to understand the performance impact. Profilers do work perfectly well for GUIs and other idle-thread programs, though it takes a little practice to read their output. I think your best bet is there: learn to use the profiler better; that's what it's for.

The specific use you describe would simply be to attach the profiler but not record yet. Navigate the GUI to the point in question, hit the profiler's record button, perform the action, and stop the recording. View the results. Fix. Repeat.
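In Python that attach-then-record workflow can be sketched with `cProfile` (the GUI functions here are hypothetical placeholders):

```python
import cProfile
import pstats

profiler = cProfile.Profile()  # "attached", but not recording yet

def navigate_gui():
    # stand-in: get the app into the state just before the slow action
    pass

def slow_action():
    # stand-in for the GUI action under investigation
    return sorted(range(50_000), key=lambda x: -x)

navigate_gui()           # not recorded
profiler.enable()        # hit "record"
slow_action()            # perform the action
profiler.disable()       # stop the recording

# view the results, sorted by time spent inside each function
stats = pstats.Stats(profiler)
stats.sort_stats("tottime").print_stats(5)
total_calls = stats.total_calls
```

After a fix, run the same sequence again and compare the two reports.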

Scott Stafford