Why can't I get line-by-line profiling data for optimized code? Why do the reported line numbers jump around? I know there are compiler optimizations, but every assembly line has a C source line matched to it. I know I can get function-level profiling, but I want to check how much time each line took. I need to know where instruction fetches happen and how many cycles they take, and where data fetches happen, so that I can (obviously) speed up the code.
If an instruction or data fetch takes X% (say 30%) of the time, each random pause will land on it with X% probability. You can take such pauses on the optimized code with GDB, for example, and then trace each one back to the guilty line of code. (Note that this does not measure cycles; it pinpoints costly instructions. The bigger X% is, the fewer samples it takes.) If you really need cycle counts, try *valgrind* or something like that. – Mike Dunlavey Apr 11 '16 at 14:00
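To put rough numbers on "the bigger X% is, the fewer samples it takes", here is a minimal sketch. It assumes the pauses are independent and that you treat a spot as worth investigating once it shows up on at least two pauses; that two-hit threshold and the function name `prob_seen_at_least_twice` are illustrative assumptions, not something stated in the comment above.

```c
#include <assert.h>

/* Probability that at least two of n independent random pauses land on
 * code that accounts for fraction x of the run time (0 <= x <= 1).
 * Hypothetical helper, for illustration only. */
double prob_seen_at_least_twice(double x, int n)
{
    assert(x >= 0.0 && x <= 1.0 && n >= 0);

    double miss_all = 1.0;          /* becomes (1 - x)^n       */
    double miss_all_but_one = 1.0;  /* becomes (1 - x)^(n - 1) */
    for (int i = 0; i < n; i++) {
        miss_all_but_one = miss_all;
        miss_all *= (1.0 - x);
    }

    /* P(exactly one hit) = n * x * (1 - x)^(n - 1) */
    double exactly_one = n * x * miss_all_but_one;

    /* P(at least two hits) = 1 - P(no hits) - P(exactly one hit) */
    return 1.0 - miss_all - exactly_one;
}
```

Under those assumptions, a spot costing 30% of the time is seen twice or more in 10 pauses with probability of roughly 0.85, while a spot costing 5% would need many more pauses to show up reliably.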
2 Answers
This will get downvoted because I'm disagreeing with your question. If you are doing this because you want to speed up the code, it is a mistake to search for speed problems in code that has been scrambled by the optimizer.**
There are two kinds of speed problems: ones that you can fix (e.g. too many `new`s, calling a function repeatedly with the same arguments, hitting the DB and then throwing away the result, a bad algorithm, subterranean logging, too much file opening/closing, unnecessary locking, etc.), and ones that the compiler can fix. The compiler cannot fix your speed problems, and it is not easy for you to do what it does.
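As a sketch of one item from that list, "calling a function repeatedly with the same arguments", consider the following. Everything here is hypothetical: `lookup_rate` stands in for some expensive call, and `lookup_calls` exists only so the number of calls can be observed.

```c
#include <stddef.h>

/* Counts how often the (pretend) expensive function is called. */
int lookup_calls = 0;

/* Hypothetical stand-in for an expensive call, e.g. a database or
 * configuration lookup. */
double lookup_rate(int region)
{
    lookup_calls++;
    return 1.0 + region * 0.05;
}

/* Slow version: calls lookup_rate once per element, always with the
 * same argument. */
double total_slow(const double *prices, size_t n, int region)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += prices[i] * lookup_rate(region);
    return total;
}

/* Fixed version: the result is fetched once and reused. */
double total_fast(const double *prices, size_t n, int region)
{
    double rate = lookup_rate(region);
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += prices[i] * rate;
    return total;
}
```

Both versions return the same total, but for `n` elements the first makes `n` calls and the second makes one. When the callee has side effects or lives in another translation unit, an optimizer generally cannot make this change for you, which is why it counts as the kind you have to fix yourself.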
To find the kind that you can fix, treat them as bugs to be found with a debugger. Here's a simple method that works. It tells you about lines of code, and a whole lot more. Do it with -O0, i.e. with the compiler's optimizer turned off. As you have seen, the optimizer will not make it any easier to find your speed problems - it will make it harder.
When you cannot find any more speedups that you can fix, then turn on -O3, and let the compiler do its magic.
P.S. Since you're math-inclined, check here.
** I know plenty of people say the opposite, but that's a classroom echo, coming from lecturers who may be very smart but have little experience in the "trenches". It's based on the wishful thinking that the code is already almost optimal.

While the assembly lines do carry information about the source line they came from, your profiler doesn't map the per-instruction information back onto source lines.
