Check this link and try this method.
The trouble with an example like Mandelbrot is that it is not a very big program. In real-world software the call tree gets much deeper and way more bushy, so you need to find out, per line or instruction, what percent of time it is responsible for, and that is just the percent of time it is on the call stack. So, you need something that samples the call stack and tells you, for each line or instruction that appears there, what percent of samples it is on. You don't need high precision of measurement - that is one of the myths.
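To make the "percent of samples it is on" idea concrete, here is a minimal sketch in Python. It assumes you already have a handful of call-stack samples, each written as a list of line identifiers; the identifiers such as `render:42` and the helper name `inclusive_percent` are made up for illustration and are not from any particular tool.

```python
from collections import Counter

def inclusive_percent(samples):
    """Return, for each line, the percent of stack samples it appears on.

    `samples` is a list of stacks; each stack is a list of line identifiers
    (hypothetical strings like "render:42").
    """
    counts = Counter()
    for stack in samples:
        # Use a set so a line that appears twice in one sample
        # (recursion) is still counted once for that sample.
        for line in set(stack):
            counts[line] += 1
    total = len(samples)
    return {line: 100.0 * n / total for line, n in counts.items()}

# Four hypothetical samples, outermost call first.
samples = [
    ["main:10", "render:42", "iterate:7"],
    ["main:10", "render:42", "iterate:7"],
    ["main:10", "render:42", "iterate:9"],
    ["main:10", "save:3"],
]

pct = inclusive_percent(samples)
for line, p in sorted(pct.items(), key=lambda kv: -kv[1]):
    print(f"{line:12s} {p:5.1f}%")
```

With only four samples the numbers are coarse, but a line that shows up on three of four samples is clearly worth looking at, which is the point about not needing high precision.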
There are tools that do this: one is RotateRight/Zoom, and another is LTProf. Personally, I swear by the totally manual method.
Over the last couple of days, we had a performance problem in some code around here. By the manual method, I found one way to save 40%. Then I found a way to save 40% of what remained, for a total saving of 64% (0.6 × 0.6 = 0.36 of the original time is left). That's just one example. Here's an example of saving over 97%.
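The way successive savings compound can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function name `combined_saving` is made up for illustration.

```python
def combined_saving(savings):
    """Total fraction of time saved when each saving applies to what is left."""
    remaining = 1.0
    for s in savings:
        remaining *= (1.0 - s)  # each fix removes a fraction of the remaining time
    return 1.0 - remaining

total = combined_saving([0.40, 0.40])
print(f"total saving: {total:.0%}")
```

Each saving multiplies the remaining time rather than adding to the previous one, so two 40% savings leave 36% of the original time, not 20%.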
Added: There are social implications of this that can limit the potential speedup. Suppose there are three problems. Problem A (in your code) takes 1/2 of the time. Problem B (in Jerry's code) takes 1/4 of the time, and problem C (in your code) takes 1/8 of the time. When you sample, problem A jumps out at you, and since it's your code, you fix it, and now the program takes 1/2 the original time. Then you sample again, and problem B (which is now 1/2) jumps out at you. You see that it is in Jerry's code, so you have to explain it to Jerry, trying not to embarrass him, and ask him if he could fix it. If he doesn't for whatever reason (like that was some of his favorite code) then even if you fix problem C, time could only be reduced to 3/8 of the original time. If he does fix it, you can fix C and get down to 1/8 of the original time. Then there could be another problem D (yours) that if you fix it could get the time down to 1/16 of the original time, but if Jerry doesn't fix problem B you can't do any better than 5/16. That is how social interaction can be absolutely critical in performance tuning.
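The fractions in that scenario can be verified with exact arithmetic. The sketch below just restates the example in Python; the cost of problem D (1/16) is inferred from the statement that fixing it takes the time from 1/8 down to 1/16, and the names `costs` and `remaining_time` are made up for illustration.

```python
from fractions import Fraction

# Share of the ORIGINAL run time each problem is responsible for.
costs = {
    "A": Fraction(1, 2),   # yours
    "B": Fraction(1, 4),   # Jerry's
    "C": Fraction(1, 8),   # yours
    "D": Fraction(1, 16),  # yours (inferred: 1/8 - 1/16)
}

def remaining_time(fixed):
    """Run time, as a fraction of the original, after fixing the given problems."""
    return 1 - sum(costs[p] for p in fixed)

after_a = remaining_time({"A"})
b_share_after_a = costs["B"] / after_a  # how big B looks once A is gone

print("after A:            ", after_a)
print("B's share after A:  ", b_share_after_a)
print("A, C (no Jerry):    ", remaining_time({"A", "C"}))
print("A, B, C:            ", remaining_time({"A", "B", "C"}))
print("A, C, D (no Jerry): ", remaining_time({"A", "C", "D"}))
print("A, B, C, D:         ", remaining_time({"A", "B", "C", "D"}))
```

Without Jerry's fix, the best case is stuck at 5/16 of the original time, five times slower than the 1/16 that is reachable with it.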
The only technique I've seen that works (because it was used on me) is to present the information in a sorrowful, apologetic tone, as if it were your problem, and to be persistent about presenting it. The apologetic tone defuses the embarrassment, and the persistence keeps him thinking about it.