0

I'm teaching a course on C++ 11 and Linux this fall, and wanted to show the students gprof, so I built a little "word count" program with multiple threads; it runs blazingly quickly and mostly spends its time incrementing counters in a std::map<string, int>.

The issue is gprof: A natural thing is to show them how a profiler helps us optimize, but when I run gprof it crashes (surprises me, given the maturity of the tool!). The very long C++-generated method names must be causing some form of buffer overflow. Here's an example (it sort of survives the flat trace and call-graph printouts, but the output is unintelligible, then dies printing the index):

Flat profile:

Each sample counts as 0.01 seconds.
 no time accumulated

  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
  0.00      0.00     0.00  6015772     0.00     0.00  std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> > >::_M_get_insert_hint_unique_pos(std::_Rb_tree_const_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
  0.00      0.00     0.00      151     0.00     0.00  std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> > >::_M_erase(std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >*)
  0.00      0.00     0.00        9     0.00     0.00  std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> > >::_M_get_insert_unique_pos(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)
  0.00      0.00     0.00        1     0.00     0.00  _GLOBAL__sub_I_lock
  0.00      0.00     0.00        1     0.00     0.00  std::_Rb_tree<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int>, std::_Select1st<std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int> >, DefineSortOrder, std::allocator<std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int> > >::_M_get_insert_unique_pos(std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)

... call-graph analysis:

granularity: each sample hit covers 2 byte(s) no time propagated

index % time    self  children    called     name
                0.00    0.00 6015772/6015772     wcounter(int) [14]
[8]      0.0    0.00    0.00 6015772         std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> > >::_M_get_insert_hint_unique_pos(std::_Rb_tree_const_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [8]
-----------------------------------------------
                             1268806             std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> > >::_M_erase(std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >*) [9]
                0.00    0.00     151/151         __tcf_1 [32]
[9]      0.0    0.00    0.00     151+1268806 std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> > >::_M_erase(std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >*) [9]
                             1268806             std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> > >::_M_erase(std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >*) [9]
-----------------------------------------------
                0.00    0.00       1/9           __tcf_1 [32]
                0.00    0.00       8/9           wcounter(int) [14]
[10]     0.0    0.00    0.00       9         std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> > >::_M_get_insert_unique_pos(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [10]
-----------------------------------------------
                0.00    0.00       1/1           __libc_csu_init [30]
[11]     0.0    0.00    0.00       1         _GLOBAL__sub_I_lock [11]
-----------------------------------------------
                0.00    0.00       1/1           std::map<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, int, DefineSortOrder, std::allocator<std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int> > >::operator[](std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&) [17]
[12]     0.0    0.00    0.00       1         std::_Rb_tree<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int>, std::_Select1st<std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int> >, DefineSortOrder, std::allocator<std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int> > >::_M_get_insert_unique_pos(std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&) [12]
-----------------------------------------------
                              463280             std::_Rb_tree<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int>, std::_Select1st<std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int> >, DefineSortOrder, std::allocator<std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int> > >::_M_erase(std::_Rb_tree_node<std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int> >*) [24]
[24]     0.0    0.00    0.00       0+463280  std::_Rb_tree<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int>, std::_Select1st<std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int> >, DefineSortOrder, std::allocator<std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int> > >::_M_erase(std::_Rb_tree_node<std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int> >*) [24]
                              463280             std::_Rb_tree<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int>, std::_Select1st<std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int> >, DefineSortOrder, std::allocator<std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int> > >::_M_erase(std::_Rb_tree_node<std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int> >*) [24]
-----------------------------------------------

This table describes the call tree of the program....

Index by function name

  [11] _GLOBAL__sub_I_lock     [8] std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> > >::_M_get_insert_hint_unique_pos(std::_Rb_tree_const_iterator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [12] std::_Rb_tree<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int>, std::_Select1st<std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int> >, DefineSortOrder, std::allocator<std::pair<std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const, int> > >::_M_get_insert_unique_pos(std::pair<int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > const&)
  [10] std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> > >::_M_get_insert_unique_pos(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [9] std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int>, std::_Select1st<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> > >::_M_erase(std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, int> >*)
ken@kenM6800:~/fast-wc$

This can't be a "new" problem... what are people using to profile C++ applications these days?

To generate an example of this problem, download http://www.cs.cornell.edu/ken/fast-wc.tar, extract into a folder fast-wc, and build per the little README. You then will need to extract the linux-5.8-rc7 source files into a folder in your parent directory, one level up (download here)

Ken Birman
  • 1,088
  • 8
  • 25
  • It may be a bug. What does the class template look like? Can you include a [mcve] in the question so we can run `gprof` on it too? Which version of `gprof` are you using? Which compiler (and version) did you use? – Ted Lyngmo Aug 16 '20 at 02:21
  • I'm using a fresh install of Ubuntu, whatever the current general release is as of yesterday, and gprof would be from apt-get, so again, the current release. As for the program, this example is just using std::map, and a few threads. No new classes at all. The program ran for 13s of user time, then terminates with a success return code of 0 -- something else that points to an issue with gprof (it claims the cumulative time was 0). – Ken Birman Aug 16 '20 at 12:38
  • Ok, so, ubuntu 20.04, g++ 9.3.0 and gprof 2.34 then? If you put the little "word count" program in the question I'll see if I can reproduce the problem on my side. – Ted Lyngmo Aug 16 '20 at 13:17
  • Ted, it isn't such a tiny program. Anyhow, meanwhile I found a case where gprof managed to run (my program has various options... somehow with 24 threads gprof was happier than with 8). But email me as ken@cs.cornell.edu and I'll be happy to share the source code for you. In fact even on this run, which is more useful, gprof -A still exits without any output at all... – Ken Birman Aug 16 '20 at 13:52
  • Ok, but it should be possible to make a [mcve] out of the program and put in the question so more than me can take a look at it. Just exclude all the options not needed and keep the minimal amount of code to reproduce the problem. – Ted Lyngmo Aug 16 '20 at 14:02
  • Ted, I'm sympathetic and agree with you, but my OP was actually asking what people use these days, not looking for someone to fix gprof! I'm trying to prepare a new course for a few hundred students who show up in a couple of days and this isn't really a bug report, even though gprof does seem buggy to me! – Ken Birman Aug 16 '20 at 14:12
  • Well, I've put two links at the end of the OP, one to a tar file with the program, one to the Linux source files this example uses, which are currently wired in (just to avoid having the word-count slowed down by doing the equivalent of a find). – Ken Birman Aug 16 '20 at 14:20
  • Ok, have you made sure it's not the program you're running that has UB? If it has, it could maybe make gprof act strangely. Perhaps valgrind, asan and ubsan will find something. – Ted Lyngmo Aug 16 '20 at 14:23
  • I just downloaded the code and noticed that you link with `-lpthread`. Use `-pthread` instead. See: [Difference between `-pthread` and `-lpthread` while compiling](https://stackoverflow.com/a/23251828/7582247). Also `filenames.hpp` is missing and `fast-wc.hpp` is only one byte. – Ted Lyngmo Aug 16 '20 at 14:57
  • Sorry... I updated the tar file to include the header that defines the set of file names. Thanks for the pthreads suggestion (of course this has nothing to do with gprof). As for fast-wc.hpp that's a placeholder for something I show them in a subsequent class. – Ken Birman Aug 16 '20 at 17:23

0 Answers0