
I'm developing with Go 1.2 on 64-bit Windows 8.1. I had many issues getting the go pprof tool to work properly, such as memory addresses being displayed instead of actual function names.

However, I found the profile package, which seems to do a great job of producing profile files that work with the pprof tool. My question is: how do I use those profile files for graphical visualization?

Dante

2 Answers


You can try go tool pprof /path/to/program profile.prof to fix the problem of function names not being resolved.

If you want graphical visualization, enter web at the pprof prompt.

Specode

If your goal is to see pretty but basically meaningless pictures, go for visualization as @Specode suggested.

If your goal is speed, then I recommend you forget visualization. Visualization does not tell you what you need to fix.

This method does tell you what to fix. You can do it quite effectively in GDB.

EDIT in response to @BurntSushi5:

Here are my "gripes with graphs" :)

In the first place, they are super easy to fool. For example, suppose A1 spends all its time calling C2, and A2 spends all its time calling C1. Then suppose a new routine B is inserted, such that when A1 calls B, B calls C2, and when A2 calls B, B calls C1. The graph loses the information that every time C2 is called, A1 is above it on the stack, and vice-versa.

[diagram: A1/A2 calling C2/C1 directly, versus through the inserted routine B]

For another example, suppose every call to C is from A. Then suppose instead A "dispatches" to a bunch of functions B1, B2, ..., each of which calls C. The graph loses the information that every call to C comes through A.

[diagram: A dispatching to B1, B2, ..., each of which calls C]

Now to the graph that was linked:

[the linked call graph]

  • It places great emphasis on self time, making giant boxes, when inclusive time is far more important. (In fact, the whole reason gprof was invented was because self time was about as useful as a clock with only a second hand.) They could at least have scaled the boxes by inclusive time.

  • It says nothing about the lines of code that the calls come from, or that are spending the self time. It's based on the assumption that all functions should be small. Maybe that's true, and maybe not, but is it a good enough reason for the profile output to be unhelpful?

  • It is chock-full of little boxes that don't matter because their time is insignificant. All they do is take up gobs of real estate and distract you.

  • There's nothing in there about I/O. The profiler from which the graph came apparently embodies the assumption that the only I/O is necessary I/O, so there's no need to profile it (even if it takes 90% of the time). In big programs, it's easy for unnecessary I/O to take a big fraction of the time, yet so-called "CPU profilers" are built on the prejudice that it doesn't even exist.

  • There doesn't seem to be any instance of recursion in that graph, but recursion is common, and useful, and graphs have difficulty displaying it with meaningful measurements.

Just pointing out that, if a small number of stack samples are taken, roughly half of them would look like this:

blah-blah-couldn't-read-it
blah-blah-couldn't-read-it
blah-blah-couldn't-read-it
fragbag.(*structureAtoms).BestStructureFragment
structure.RMSDMem
... a couple other routines

The other half of the samples are doing something else, equally informative. Since each stack sample shows you the lines of code where the calls come from, you're actually being told why the time is being spent. (Activities that don't take much time have very small likelihood of being sampled, which is good, because you don't care about those.)

Now I don't know this code, but the graph gives me a strong suspicion that, like a lot of code I see, the devil's in the data structure.

Mike Dunlavey
    I've used the profiler bundled with the Go tool numerous times to optimize my programs successfully. This includes using the "meaningless" call graph that can be produced by the profiler. – BurntSushi5 Jan 04 '14 at 03:28
  • @BurntSushi5: a) Were your programs big, or little? Profilers work great on little programs. b) How much speedup did you get? People are happy with little speedups - they assume that's all there is. [*Here's an example of 730x.*](http://scicomp.stackexchange.com/a/1870/1262) [*Here's one of those meaningful graphs.*](http://stackoverflow.com/q/20536873/23771) – Mike Dunlavey Jan 04 '14 at 15:41
  • A wide range of programs. Some little some big. – BurntSushi5 Jan 04 '14 at 20:11
  • Also, that graph doesn't look very useful. The percentages are all at 100%! Here is the kind of graph that I use, which is produced from the Go toolchain: http://burntsushi.net/stuff/flib-prof.jpg I don't keep track of every speed up that I get, but I frequently write naive code that may take days to run. With the help of the profiler, I've sometimes managed to get it down to hours or minutes. (It's frequently a matter of allocating more intelligently, and the profiler helps a lot with that.) – BurntSushi5 Jan 04 '14 at 20:20
  • Note that I don't claim my approach is better. I merely claim that these graphs *are* meaningful, and my evidence is that I've used them to tune my programs to great effect. – BurntSushi5 Jan 04 '14 at 20:23
  • @BurntSushi5: Well, you pushed my button, so I edited the answer. – Mike Dunlavey Jan 04 '14 at 22:13
  • Thanks for the edit, that's pretty good info. I've reversed my downvote into an upvote. I don't entirely disagree with your analysis, BTW, I just took issue with the "meaningless" moniker. For example, I can take one quick look at that graph and immediately know that all those small boxes are related to parsing an arcane format that biologists like (the right branch). While optimizing that program, I focused mostly on the left branch. (And indeed, getting most of the CPU time in the RMSD functions was the key to optimizing the program from where it was initially.) – BurntSushi5 Jan 05 '14 at 03:32
  • And just so you don't think I'm a complete dolt, I frequently supplement graphs with sampling rates with respect to each line of source code. But the graph usually gives me a great place to start. – BurntSushi5 Jan 05 '14 at 03:33
  • @BurntSushi5: I criticize concepts, not people. We're all in this together. I don't know if you have enough rep to see deleted posts, but I gathered all my observations about profiling [*here*](http://stackoverflow.com/a/1779343/23771), and if you care to see statistical justification, you might look [*here*](http://scicomp.stackexchange.com/a/2719/1262). Answers to critiques [*here*](http://stackoverflow.com/a/18217639/23771). BTW, all that parsing should be I/O bound. If it isn't, there's room for speedup there also. – Mike Dunlavey Jan 05 '14 at 16:44
  • Thanks, those are helpful links (I can only see the latter two). The parsing may be I/O bound, but it's doing a fair bit of work to "fix" certain pieces of the data. This (may) include an application of Needleman-Wunsch sequence alignment for each chain. (Yeah, it's gross.) – BurntSushi5 Jan 06 '14 at 01:22