
I want to know how much time the different routines in my application take. I am using GCC 3.4.2 with the Dev-C++ IDE and gprof for profiling. Here is the beginning of the result file:

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
  7.48      0.89     0.89                             __gnu_cxx::__exchange_and_add(int volatile*, int)
  7.39      1.77     0.88                             _Unwind_SjLj_Register
  6.22      2.51     0.74                             _Unwind_SjLj_Unregister
  3.70      2.95     0.44  2425048     0.00     0.00  rt::wctree_node<std::vector<OPT_Inst, std::allocator<OPT_Inst> > >::get(std::string, bool&)
  3.28      3.34     0.39                             std::string::operator[](unsigned int)
  3.11      3.71     0.37                             std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string()
  2.86      4.05     0.34                             std::string::_M_mutate(unsigned int, unsigned int, unsigned int)
  2.69      4.37     0.32                             __gnu_cxx::__atomic_add(int volatile*, int)
  2.61      4.68     0.31    38655     0.00     0.00  SPSBase::containerBoxFillSet(double, double, double, double)

Can someone explain the first entries to me, apart from rt::wctree_node (they are obviously not written by me)? Where do they come from, and what is their purpose in the program?

Mat
Vladimir Gazbarov

1 Answer


The two _Unwind_SjLj_ entries look to me like exception handling (SjLj stands for setjmp/longjmp).

The _M_mutate entry seems to indicate that you are copying strings (it is an implementation detail of the copy-on-write behavior of the libstdc++ std::string), which is corroborated by the presence of the string destructor in the profile.

I guess the atomic operations (__exchange_and_add and __atomic_add) also come from the string COW behavior, since the internal buffer is reference counted.
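To make the guess about reference counting concrete, here is a minimal sketch of a copy-on-write string. It is not the libstdc++ implementation: `CowString`, `Rep`, `set_char` and the other names are hypothetical, and it assumes a C++11 compiler for `std::atomic` (the GCC 3.4.2 in the question predates that). It only shows why every copy and every destruction costs an atomic operation, and why a write to a shared buffer forces a real copy.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical, simplified copy-on-write string; NOT the libstdc++ code.
class CowString {
    struct Rep {
        std::atomic<int> refs;  // shared by every CowString using this buffer
        std::string data;       // stands in for the raw character buffer
        explicit Rep(const std::string& s) : refs(1), data(s) {}
    };
    Rep* rep_;

    // Atomic decrement; the last owner frees the buffer.
    void release() {
        if (rep_->refs.fetch_sub(1) == 1) delete rep_;
    }

public:
    explicit CowString(const std::string& s) : rep_(new Rep(s)) {}

    // Copying shares the buffer: one atomic increment, no character copy.
    CowString(const CowString& other) : rep_(other.rep_) {
        rep_->refs.fetch_add(1);
    }

    CowString& operator=(const CowString& other) {
        if (rep_ != other.rep_) {
            other.rep_->refs.fetch_add(1);
            release();
            rep_ = other.rep_;
        }
        return *this;
    }

    ~CowString() { release(); }

    // A write must first detach from a shared buffer (the deep copy).
    void set_char(std::size_t i, char c) {
        assert(i < rep_->data.size());
        if (rep_->refs.load() > 1) {
            Rep* fresh = new Rep(rep_->data);
            release();
            rep_ = fresh;
        }
        rep_->data[i] = c;
    }

    char get_char(std::size_t i) const { return rep_->data[i]; }
    int use_count() const { return rep_->refs.load(); }
    bool shares_buffer_with(const CowString& o) const { return rep_ == o.rep_; }
};
```

In this sketch, copying a `CowString` and then destroying the copy costs two atomic operations even though no characters move, which is the kind of overhead the atomic entries in the profile would be consistent with.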

So it seems the bulk of your time is spent copying std::string around.

EDIT: okay, so now look at your rt::wctree_node<>::get(std::string, bool&). The parameter is passed by value: 2425048 calls, 2425048 copies. Why don't you try a const& here?
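A small sketch of the difference, under stated assumptions: `CountedKey` is a hypothetical stand-in for std::string that counts its copy constructions, and `lookup_by_value` / `lookup_by_const_ref` are made-up functions that only mirror the shape of get(std::string, bool&), not its real body.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the key type; counts copy constructions.
struct CountedKey {
    static int copies;
    std::string text;
    explicit CountedKey(const std::string& s) : text(s) {}
    CountedKey(const CountedKey& other) : text(other.text) { ++copies; }
};
int CountedKey::copies = 0;

// Same shape as get(std::string, bool&): the argument is copied on every call.
std::size_t lookup_by_value(CountedKey key, bool& found) {
    found = !key.text.empty();
    return key.text.size();
}

// Same body, but the caller's object is only referenced, never copied.
std::size_t lookup_by_const_ref(const CountedKey& key, bool& found) {
    found = !key.text.empty();
    return key.text.size();
}

// Calls `lookup` n times on the same key and returns the number of copies made.
template <typename Lookup>
int copies_made(Lookup lookup, int n) {
    assert(n >= 0);
    CountedKey key("some/path");
    bool found = false;
    CountedKey::copies = 0;  // ignore anything counted before the loop
    for (int i = 0; i < n; ++i) lookup(key, found);
    return CountedKey::copies;
}
```

Here `copies_made(lookup_by_value, n)` returns n while `copies_made(lookup_by_const_ref, n)` returns 0. With a copy-on-write string each of those copies is cheap-looking but still pays for the reference-count update and the destructor.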

Matthieu M.
  • How about the `__exchange_and_add` thing? I tried to avoid exception handling by using -fno-exceptions, and it does not seem to work. What am I doing wrong? – Vladimir Gazbarov Jul 14 '12 at 16:21
  • @VladimirGazbarov: both `__exchange_and_add` and `__atomic_add` are atomic operations. As for doing it wrong, it's quite hard to tell: I am *guessing* here. I'll add a new tidbit though, if it pleases you ;) – Matthieu M. Jul 14 '12 at 17:04
  • Regarding the string being passed, I am pretty sure that strings hold a shared pointer internally and only perform the copy when asked to change. I will test though. Here is another important finding: adding elements to a std::list triggers all the SjLj unwind stuff, which is indeed related to exceptions, and it takes around 10 times longer than adding them to a std::vector, which does not use exception handling. – Vladimir Gazbarov Jul 15 '12 at 06:58
  • @VladimirGazbarov: the libstdc++ `string` (shipped with gcc) does not use a `shared_ptr` internally, but it does use something close to it (a reference-counted internal buffer). Note that a `list` is much slower than a `vector` because each node is allocated separately in memory. – Matthieu M. Jul 15 '12 at 11:56
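
The point in the last comment about per-node allocation can be illustrated by counting allocations instead of timing them. This is only a sketch: `CountingAllocator`, `g_allocations` and `allocations_for` are hypothetical names, a C++11 standard library is assumed, and the exact counts depend on the implementation's growth policy. It does not reproduce the 10x timing reported above.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>
#include <vector>

// Global counter used only by this sketch.
static std::size_t g_allocations = 0;

// Minimal C++11 allocator that counts calls to allocate().
template <typename T>
struct CountingAllocator {
    typedef T value_type;
    CountingAllocator() {}
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) {}

    T* allocate(std::size_t n) {
        ++g_allocations;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

// Appends n ints to a fresh Container and returns how many allocations that took.
template <typename Container>
std::size_t allocations_for(std::size_t n) {
    g_allocations = 0;
    {
        Container c;
        for (std::size_t i = 0; i < n; ++i) c.push_back(static_cast<int>(i));
        assert(c.size() == n);
    }
    return g_allocations;
}

typedef std::list<int, CountingAllocator<int> > CountedList;
typedef std::vector<int, CountingAllocator<int> > CountedVector;
```

For n push_back calls, `allocations_for<CountedList>(n)` is at least n (one node per element), while `allocations_for<CountedVector>(n)` grows only logarithmically with n because the vector reallocates geometrically.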