4

I've got some inherited in-house C++ code, which when compiled with VC++ on Windows, runs an order of magnitude faster than when compiled with g++ on Linux (5 minutes vs. 2 hours). This remains the case both with and without the "normal" optimization flags, as well as over a few different versions of each compiler and respective platform, all on comparable hardware.

Building a debug/profile version (-g -pg) on Linux with g++, I see the following three areas are consuming most of the time:

%   cumulative   self              self     total
time   seconds   seconds    calls  Ks/call  Ks/call  name
31.95    955.93   955.93 3831474321     0.00     0.00  std::_List_const_iterator<xxFile>::operator!=(std::_List_const_iterator<xxFile> const&) const
22.51   1629.64   673.71 3144944335     0.00     0.00  std::_List_const_iterator<xxFile>::operator++()
15.56   2095.29   465.65 686529986      0.00     0.00  std::iterator_traits<std::_List_const_iterator<dtFile> >::difference_type std::__distance<std::_List_const_iterator<xxFile> >(std::_List_const_iterator<xxFile>, std::_List_const_iterator<xxFile>, std::input_iterator_tag)

(The xxFile class consists of ints, floats, doubles, bools, and strings)

My naive guesses are that there's something poorly coded which VC++ is compensating for or that the GNU STL may not be as optimized. I'm currently working on compiling the g++/Linux version with the Boost library, starting with assign/std/list.hpp and the boost::assign namespace.

I'm unable to share the code, but does something obvious (besides my limited C++ experience) jump out as the cause, based on your experience?

  • Did you compile with `g++ -O2`? Did you use the latest GCC compiler (4.6.2)? – Basile Starynkevitch Dec 12 '11 at 19:47
  • 1
    Yes, it does, though I'm unable to share the answer ;-) – Kerrek SB Dec 12 '11 at 19:48
  • I have tried -02 optimization and my latest attempt was with GCC 4.6.2 on Fedora 16 (as well as many past versions). – deep_spacing Dec 12 '11 at 19:54
  • And how is the difference between VC++ and g++ with some dummy class used with list? If there is a difference of magnitude as well, it means it is about list, not xxFile, and second -- you can share the code. – greenoldman Dec 12 '11 at 20:04
  • How do the two test machines compare? My first real guess is cache issues -- your program was heavily optimized for the machine you first ran it on, and possibly to the way MSVC++ lays out its lists. –  Dec 12 '11 at 20:05
  • 1
    I could hazard a guess, but guesses are often / usually wrong. You're using `gprof`. Don't use `gprof` - it won't tell you what the real problem is. *[Look here for reasons why, and an alternative.](http://stackoverflow.com/questions/1777556/alternatives-to-gprof/1779343#1779343)* – Mike Dunlavey Dec 12 '11 at 20:10
  • It's hard to help without seeing code fragments. – Has QUIT--Anony-Mousse Dec 12 '11 at 20:33
  • 1
    There are some known bugs in GCC (even 4.6.2) about size() not being constant time for STL collections. – Basile Starynkevitch Dec 12 '11 at 20:38
  • @BasileStarynkevitch: `size()` isn't required to be constant time for lists. Making it so causes other list operations like `splice` to be slow. – Zan Lynx Dec 13 '11 at 00:56

2 Answers2

3

A couple things I'll toss out without knowing anything about your code except that it spends a lot of time iterating lists.

Do you need to use lists?

If the bulk of your time is spent iterating, maybe you should be using a contiguous data structure (array/vector). Unless you require list behavior for some other reason (i.e., lots of insertions elsewhere, which might otherwise dominate your runtime) using arrays should give much better performance due to memory locality. Even though iterating an array and list are both O(n), in practice arrays can be much faster due to taking proper advantage of the CPU's caches. When you go through a list->next you don't know where you might end up, and if you end up page faulting all the time you will have horrible performance.

Are you doing lots of searches? Maybe lots of linear searches?

The fact that operator != is at the top of the list makes it appear this way. Who is calling that and why? If you are doing lots of searching and this dominates your runtime, you may also want to consider a different data structure, perhaps a binary search tree or hash table if you do lots of insertions and lookups.

Suboptimus
  • 309
  • 1
  • 6
0

Because the top items are distance, operator++ and operator!= I believe your problem is using the size() operation on a list.

This is one of those areas where the C++ standard allows different implementations. The MSVC implementation stores the size of a list as a number, so returning the value is very fast. GCC does not, so to get the size of a list it must essentially perform:

count = 0;
for(
    list::iterator it = l.begin()
    it != l.end();
    ++it
)
    ++count;

And in that loop you can see each of the operations which top your profile.

Zan Lynx
  • 53,022
  • 10
  • 79
  • 131
  • Thank you! This URL seems to support that ~ http://www.linuxprogrammingblog.com/std-list-size-complexity ~ Not wanting to rewrite the code, I'm going to look into a separate STL. – deep_spacing Dec 13 '11 at 23:54
  • Everyone... I rebuilt the Linux app with boost::interprocess' list, which noted O(1) size() complexity in its header (presumably due to a count variable vs. iteration) and it now operates on par with the Windows version! ~5 minutes, instead of hours! I believe once GNU c++ fully incorporates C++11, they'll match this performance, as constant performance for this function seems defined in the spec. – deep_spacing Dec 14 '11 at 18:46