I've inherited some in-house C++ code that runs more than an order of magnitude faster when compiled with VC++ on Windows than when compiled with g++ on Linux (5 minutes vs. 2 hours). This holds both with and without the "normal" optimization flags, and across a few different versions of each compiler and platform, all on comparable hardware.
In a debug/profile build (-g -pg) with g++ on Linux, the following three functions consume most of the time:
  %   cumulative   self                  self     total
 time   seconds   seconds      calls   Ks/call  Ks/call  name
31.95    955.93    955.93 3831474321      0.00     0.00  std::_List_const_iterator<xxFile>::operator!=(std::_List_const_iterator<xxFile> const&) const
22.51   1629.64    673.71 3144944335      0.00     0.00  std::_List_const_iterator<xxFile>::operator++()
15.56   2095.29    465.65  686529986      0.00     0.00  std::iterator_traits<std::_List_const_iterator<dtFile> >::difference_type std::__distance<std::_List_const_iterator<xxFile> >(std::_List_const_iterator<xxFile>, std::_List_const_iterator<xxFile>, std::input_iterator_tag)
(The xxFile class consists of ints, floats, doubles, bools, and strings)
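One thing the call counts do show: 3144944335 (operator++) + 686529986 (std::__distance) = 3831474321, which is exactly the operator!= count, so essentially all of the iterator traffic appears to come from std::distance walking the list node by node. In libstdc++ releases before GCC 5, std::list::size() was computed this way, so a length check inside a loop would produce this pattern. Below is a minimal sketch of that suspected pattern and one alternative. It is a hypothetical reduction, not the actual code: the xxFile fields and all function names are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>

// Hypothetical stand-in for the real xxFile class (ints, doubles, bools, strings).
struct xxFile {
    int id;
    double weight;
    bool active;
    std::string name;
};

// Counts elements the way std::distance does for list iterators:
// one operator!= and one operator++ per element.
std::size_t count_by_walking(const std::list<xxFile>& files) {
    return static_cast<std::size_t>(std::distance(files.begin(), files.end()));
}

// Suspected pattern: the length is recomputed on every pass through the loop,
// so a list of n elements costs on the order of n * n iterator steps.
std::size_t sum_ids_recounting(const std::list<xxFile>& files) {
    std::size_t total = 0;
    std::list<xxFile>::const_iterator it = files.begin();
    for (std::size_t i = 0; i < count_by_walking(files); ++i, ++it) {
        total += static_cast<std::size_t>(it->id);
    }
    return total;
}

// Alternative: plain iterator traversal, with no length query at all.
std::size_t sum_ids_iterating(const std::list<xxFile>& files) {
    std::size_t total = 0;
    for (std::list<xxFile>::const_iterator it = files.begin();
         it != files.end(); ++it) {
        total += static_cast<std::size_t>(it->id);
    }
    return total;
}

// Small self-check: both loops must produce the same result.
void check_equivalent(const std::list<xxFile>& files) {
    assert(sum_ids_recounting(files) == sum_ids_iterating(files));
}
```

If the real code calls size() on a std::list in a hot loop, testing with empty() or caching the length outside the loop would be the first things I'd try.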
My naive guesses are that either something is poorly coded and VC++ is compensating for it, or the GNU implementation of the STL is not as well optimized. I'm currently working on compiling the g++/Linux version with the Boost library, starting with assign/std/list.hpp and the boost::assign namespace.
I'm unable to share the code, but does something obvious (besides my limited C++ experience) jump out as the cause, based on your experience?