12

I am using gcc 4.8.1 and after hours of debugging a horrible mysterious performance issue I found out that the std::list::size is actually implemented as a call to std::distance.

/**  Returns the number of elements in the %list.  */
      size_type
      size() const _GLIBCXX_NOEXCEPT
      { return std::distance(begin(), end()); }

This surprised me, since the reference says that the complexity of std::list::size should be constant and the complexity of std::distance is linear for std::list::iterator.

I am really confused, since I think gcc has excellent support for C++11 features and I see no reason why they would not implement this one.

Is this an error in the reference or in gcc?

In the latter case:

is there any reason why such a fundamental C++11 feature would be missing for so long?

Is there a third possibility e.g.:

Could I have gcc 4.8.1 but some older version of the standard library?

Martin Drozdik
  • 12,742
  • 22
  • 81
  • 146
  • 2
    It was `O(n)` in C++03, which has been changed in C++11 and now it is `O(1)`. Seems GCC has not updated this bit. Bad. – Nawaz Oct 03 '13 at 08:23
  • 1
    GCC has many standard violations for C++11. `list` and `string` come to mind; it's a hard choice between conforming and breaking old code. – Kerrek SB Oct 03 '13 at 08:26
  • More importantly, *why* are you using `std::list::size()`? – Kerrek SB Oct 03 '13 at 08:26
  • @KerrekSB Is there something wrong with using it? Is it bad style? – Martin Drozdik Oct 03 '13 at 08:28
  • 2
    @MartinDrozdik: It seems algorithmically questionable. Typically you need `empty()`, but `size()` much less often. If your algorithm really calls for a list (rather than, say, a vector), you would normally just query whether there's *something* in the list and pop it out. – Kerrek SB Oct 03 '13 at 08:32
  • @KerrekSB Thank you for your concern and advice. I needed to know the size to reserve space in a vector where I put some data from the list. There is one piece of data for each item in the list. – Martin Drozdik Oct 03 '13 at 08:38
  • 2
    @MartinDrozdik: A `vector` has amortized complexity when using `push_back` so you should be able to let it grow at relatively low-cost (depending on how expensive it is to move the objects within, of course). – Matthieu M. Oct 03 '13 at 08:48
  • 1
    gcc-5 will, by default, have C++11 compliant `std::list` and `std::string`. This is an ABI breaking change as noted by others. There is a mechanism involving inline namespaces that will allow switching. There was effort to make sure old libs don't break. I'm sure this will be spelled out in more detail on release which should be RSN. – emsr Jan 22 '15 at 02:10

1 Answers1

10

This is not exactly a bug and you can read about it here:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49561

It's more of a case of compatibility with older versions of gcc. Looks like they really don't want to add an additional "data member".

Quoting:

This patch made c++98 and c++11 code incompatible and is causing serious problems for distros.

Where the patch is the fix they implemented for gcc 4.7 (it was O(1) in it).

Another quote:

maintaining ABI compatibility has been decided to be more important for the current releases

tomi.lee.jones
  • 1,563
  • 1
  • 15
  • 22
  • I'm wondering why `O(1)` would break older code? Whether it is `O(n)` or `O(1)`, the client code `list.size()` will be same whatsoever. – Nawaz Oct 03 '13 at 08:30
  • @Nawaz: Because implementing it as O(1) requires adding a "size" variable to the `list` class, thus breaking binary compatibility. – Jonathan Potter Oct 03 '13 at 08:34
  • @Nawaz: Because it breaks the binary layout of the class. – Kerrek SB Oct 03 '13 at 08:34
  • @KerrekSB: C++ never promises that. Do the compilers promise that? Is newer GCC 100% compatible with older ones? I don't think so. – Nawaz Oct 03 '13 at 08:36
  • 1
    @Nawaz: They try very hard. Why do you think `string` has been broken for years? There's `__vstring` which is correct, but it's not been made the default string class out of fear of breaking backwards compatibility. – Kerrek SB Oct 03 '13 at 08:37
  • I've done seen pretty icky things for compatibility reasons (e.g. upgrading the phone network to handle 30 digit numbers while sticking to 16 bytes per number). Squirreling away an extra 32 bits, which (while present) would make C++11 `size()` O(1), is not hard in comparison. – MSalters Oct 03 '13 at 13:59
  • 1
    @MSalters Is there any way that really wouldn't break the ABI? keep in mind that the problems are that there's code out there mixing C++98 and C++11 code; as in passing lists between modules, using C++98 list member functions on C++11 lists and vice versa, and probably mixing C++98 binaries compiled with gcc 4.x with C++11 binaries compiled with gcc 4.y. – bames53 Oct 03 '13 at 18:38
  • 1
    @bames53: It's often possible, with in general the proviso that calling a size-altering C++98 method will almost certainly break O(1) size. It would now know about the cached size member, and fail to update it. That does mean the C++11 implementation must detect stale size values, but it can use any kind of dirty implementation hack for that. E.g. set the lowest bit of the pointer to the head node, assuming node pointers are at least 2-byte aligned. – MSalters Oct 04 '13 at 07:28
  • Isn't this *exactly* the time and place to `ifdef` on `--std=...` and do the correct thing for the specified standard? When mixing code compiled with different `--std`, one should expect it to not work. Right? Isn't this the whole point of that switch? – KitsuneYMG Mar 16 '14 at 02:49
  • @MSalters I don't know if this is still current. But wouldn't this change `sizeof(std::list)`, thereby causing all kinds of bugs? – Paul Groke Mar 14 '16 at 15:06
  • @PaulGroke: You can't change `sizeof`, indeed, so that's a hard limit on possible implementations. Having said that, I now realize it's probably impossible for big-O reasons. Since insertion in the middle and `.size()` both have to be O(1), there now must be a direct link between each node and whatever data structure holds the size. No matter the representation of than link, it must be reachable in O(1) steps from each node. This is not possible with a pure forward/back pointer pair. – MSalters Mar 14 '16 at 18:30