
I am experiencing a very strange behavior, which I distilled down to a very basic test:

#include <string>
#include <filesystem>

int main(void)
{
  const std::string name = "foo";
  const std::filesystem::path lock_dir = "/tmp";
  std::filesystem::path lockfile = lock_dir / name;

  return 0;
}

I compile this with g++ -std=c++17 -Wall -Wextra -Werror -g foo.cpp -o foo. When I run it, I get a std::bad_alloc exception on the line where the two paths are joined. Here is what I see in gdb:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ffff742c801 in __GI_abort () at abort.c:79
#2  0x00007ffff7a8e1f2 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007ffff7a99e36 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007ffff7a99e81 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007ffff7a9a0b5 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007ffff7a907a7 in std::__throw_bad_alloc() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x0000555555558cfe in __gnu_cxx::new_allocator<std::filesystem::__cxx11::path::_Cmpt>::allocate (this=0x7fffffffe080, __n=12297828079348111650) at /usr/include/c++/8/ext/new_allocator.h:102
#8  0x00005555555587d0 in std::allocator_traits<std::allocator<std::filesystem::__cxx11::path::_Cmpt> >::allocate (__a=..., __n=12297828079348111650) at /usr/include/c++/8/bits/alloc_traits.h:436
#9  0x0000555555557f76 in std::_Vector_base<std::filesystem::__cxx11::path::_Cmpt, std::allocator<std::filesystem::__cxx11::path::_Cmpt> >::_M_allocate (this=0x7fffffffe080, __n=12297828079348111650)
    at /usr/include/c++/8/bits/stl_vector.h:296
#10 0x0000555555558387 in std::_Vector_base<std::filesystem::__cxx11::path::_Cmpt, std::allocator<std::filesystem::__cxx11::path::_Cmpt> >::_M_create_storage (this=0x7fffffffe080, __n=12297828079348111650)
    at /usr/include/c++/8/bits/stl_vector.h:311
#11 0x00005555555579cf in std::_Vector_base<std::filesystem::__cxx11::path::_Cmpt, std::allocator<std::filesystem::__cxx11::path::_Cmpt> >::_Vector_base (this=0x7fffffffe080, __n=12297828079348111650, __a=...)
    at /usr/include/c++/8/bits/stl_vector.h:260
#12 0x0000555555556d39 in std::vector<std::filesystem::__cxx11::path::_Cmpt, std::allocator<std::filesystem::__cxx11::path::_Cmpt> >::vector (this=0x7fffffffe080, 
    __x=std::vector of length -1303124922760, capacity -1303124922760 = {...}) at /usr/include/c++/8/bits/stl_vector.h:460
#13 0x000055555555635f in std::filesystem::__cxx11::path::path (this=0x7fffffffe060, Python Exception <class 'gdb.error'> There is no member or method named _M_t.: 
__p=...) at /usr/include/c++/8/bits/fs_path.h:166
#14 0x00005555555563c8 in std::filesystem:: (Python Exception <class 'gdb.error'> There is no member or method named _M_t.: 
__lhs=..., Python Exception <class 'gdb.error'> There is no member or method named _M_t.: 
__rhs=...) at /usr/include/c++/8/bits/fs_path.h:554
#15 0x0000555555555fbe in main () at foo.cpp:8

This brings up several questions:

  1. What is wrong with my test code?
  2. Why does GDB show anything with python in the call stack?

Anticipating the question: my g++ is gcc version 8.3.0 (Ubuntu 8.3.0-6ubuntu1~18.04.1) and my gdb is GNU gdb (Ubuntu 8.2-0ubuntu1~18.04) 8.2.

UPDATE: Here is the output of ldd for the successfully compiled executable:

linux-vdso.so.1 (0x00007ffc697b2000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f5c35444000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f5c3522c000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f5c34e3b000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f5c34a9d000)
/lib64/ld-linux-x86-64.so.2 (0x00007f5c35a2d000)
Marshall Clow
Paul Grinberg
  • Do you notice any difference if you add `-lstdc++fs` when compiling/linking? – Ted Lyngmo Jun 24 '19 at 14:32
  • @TedLyngmo without it he would have a linking issue. – Marek R Jun 24 '19 at 14:38
  • [Here it works](https://wandbox.org/permlink/yrGK19ZasFCUyZeP) with no problems it must be some problem on your machine. – Marek R Jun 24 '19 at 14:39
  • @MarekR I wasn't sure what issue OP has really and since I think gcc 8 wants `-lstdc++fs` (while gcc 9 seems to be fine without it) I thought it would be worth a try.. – Ted Lyngmo Jun 24 '19 at 14:42
  • The Python stuff is just a GDB plugin that failed during debugging (when it tried to make some output human-readable). Does your error appear during a regular run or only when you debug your code? – Marek R Jun 24 '19 at 14:42
  • @MarekR - I suspect that my Ubuntu VM may be at fault here. I just did an `apt update && apt upgrade` and perhaps this is somehow related – Paul Grinberg Jun 24 '19 at 14:46
  • This error appears during regular run – Paul Grinberg Jun 24 '19 at 14:46
  • I wonder how you can compile using gcc 8.3.0 without `-lstdc++fs`. It should fail ... :-/ – Ted Lyngmo Jun 24 '19 at 14:48
  • @TedLyngmo - I added ldd info for the executable. It clearly links and runs, except for the bad_alloc exception which I am trying to track down with this question – Paul Grinberg Jun 24 '19 at 15:04
  • @PaulGrinberg Ok, and if you add `-lstdc++fs` when compiling and then do `ldd`, any difference? – Ted Lyngmo Jun 24 '19 at 15:07
  • @TedLyngmo - no difference on either `ldd` output or crash behavior when I add `-lstdc++fs` – Paul Grinberg Jun 24 '19 at 15:11
  • Ok, and what version is `libstdc++.so.6` pointing at? `ls -l /usr/lib/x86_64-linux-gnu/libstdc++.so.6` should show the full version, like `libstdc++.so.6.0.25` for gcc 8 and `libstdc++.so.6.0.26` for gcc 9.. – Ted Lyngmo Jun 24 '19 at 15:18
  • @TedLyngmo - I think you may be onto something: `/usr/lib/x86_64-linux-gnu/libstdc++.so.6 -> libstdc++.so.6.0.26` – Paul Grinberg Jun 24 '19 at 15:23
  • :-) Yeah, it's definitely linked with the gcc 9 lib. That would explain why linking without `-lstdc++fs` worked. No idea how that could happen. Perhaps your development tools didn't get the same update as the libraries when you updated...? odd ... – Ted Lyngmo Jun 24 '19 at 15:25
  • Here is another data point. In my bad VM, I can compile/link a bad executable with `g++ -std=c++17 -Wall -Wextra -Werror -g foo.cpp -o foo`. When I compile/link with `g++ -std=c++17 -Wall -Wextra -Werror -g foo.cpp -o foo -lstdc++fs` it runs without a problem – Paul Grinberg Jun 24 '19 at 15:26
  • Odd indeed ... I wish I had an answer, but now you know where to start looking at least. :) – Ted Lyngmo Jun 24 '19 at 15:29
  • According to stacktrace, things go wrong when there was an attempt to allocate vector of size `12297828079348111650`. – ks1322 Jun 24 '19 at 16:16
  • Could it be some interaction between GCC 8.3 and the GCC 9 lib? – 1201ProgramAlarm Jun 24 '19 at 17:03
  • The Python business is a bug in GDB’s libraries (which are in Python) for pretty-printing C++ objects mentioned in the traceback. – Davis Herring Jun 24 '19 at 23:23
  • I thought I was on Ubuntu 18.04.2 LTS, so it was a surprise to me to find out that somehow I had `/etc/apt/sources.list.d/ubuntu-toolchain-r-ubuntu-test-bionic.list`, which is how my system got broken. – Paul Grinberg Jun 25 '19 at 14:11
  • @TedLyngmo your comments above are spot on and would be 100% correct, if it wasn't for a "feature" of Ubuntu. It's by design that Ubuntu ships inconsistent headers and shared library for libstdc++, and that's why the program links without using `-lstdc++fs`, and that's why it crashes at runtime. See my answer below for more info. – Jonathan Wakely Sep 02 '19 at 16:37

2 Answers


This is caused by a "feature" of Ubuntu, which provides a later libstdc++.so than the one that comes with the system g++. See https://bugs.launchpad.net/ubuntu/+source/gcc-8/+bug/1824721 for more details.

Normally when compiling with GCC 8 the std::filesystem symbols are not present in libstdc++.so and so if you fail to link with -lstdc++fs then you will get a linker error. But because the newer libstdc++.so from GCC 9 does include symbols for std::filesystem, that linker error doesn't happen. Unfortunately, the GCC 9 versions of the filesystem symbols are not compatible with the GCC 8 headers (because the filesystem library was experimental and unstable in GCC 8, and the layout of filesystem::path changed for GCC 9). This means that your program links, but then at runtime it uses the wrong symbols for filesystem::path, and bad things happen.
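
To illustrate the kind of mismatch described above, here is a minimal sketch using two hypothetical structs, PathLayoutOld and PathLayoutNew. They are stand-ins invented for this example, not the real libstdc++ definitions; the only point is that code compiled against one layout and library code built for another disagree about an object's size and where its members live.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a class as the OLD headers declare it:
// the component list is stored inline as a vector.
struct PathLayoutOld {
  std::string pathname;
  std::vector<int> components;
  unsigned char type;
};

// Hypothetical stand-in for the same class as the NEW library was built:
// the component list now lives behind a single pointer.
struct PathLayoutNew {
  std::string pathname;
  std::unique_ptr<int> components;
};

// True when the two sides would disagree about the object's size.
constexpr bool layouts_disagree() {
  return sizeof(PathLayoutOld) != sizeof(PathLayoutNew);
}
```

If code compiled with the first definition copies an object that library code initialised according to the second, it reads bytes that were never a vector, which is consistent with the garbage length in the stack trace.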

I didn't anticipate this problem, because I didn't know Ubuntu mixes old libstdc++ headers with a new libstdc++ shared library. That's usually safe to do, except when using "experimental", incomplete features, such as the C++17 features in GCC 8.

The fix I suggested for Ubuntu was to make g++ automatically add -lstdc++fs to the end of your compilation command. If you use any std::filesystem features then the correct definitions for those symbols should be found in GCC 8's libstdc++fs.a (rather than in GCC 9's libstdc++.so) and in most cases everything should Just Work. If Ubuntu didn't update their GCC packages with that workaround yet, you can also make it work by just making sure you manually link with -lstdc++fs (which is documented as required for GCC 8 anyway).
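
One way to see which libstdc++ headers a program was compiled against is to print the library's version macros. This is a small sketch, not part of the fix: _GLIBCXX_RELEASE and __GLIBCXX__ are libstdc++ macros that describe the headers only, and the helper name libstdcxx_header_version is made up for this example. The shared library loaded at runtime has to be checked separately, for instance with the ls -l command from the comments on the question.

```cpp
#include <string>

// Describes the libstdc++ HEADERS seen at compile time.
// Says nothing about which libstdc++.so is loaded at runtime.
std::string libstdcxx_header_version() {
#if defined(_GLIBCXX_RELEASE)
  return "libstdc++ headers from GCC " + std::to_string(_GLIBCXX_RELEASE) +
         " (datestamp " + std::to_string(__GLIBCXX__) + ")";
#elif defined(__GLIBCXX__)
  return "libstdc++ headers, datestamp " + std::to_string(__GLIBCXX__);
#else
  return "not libstdc++";
#endif
}
```

On a system like the asker's, this would be expected to report GCC 8 headers even though libstdc++.so.6 resolves to libstdc++.so.6.0.26, the GCC 9 library.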

Jonathan Wakely
  • Whoa... nasty :-) Good to know! I'm running Ubuntu at work. – Ted Lyngmo Sep 02 '19 at 16:42
  • @TedLyngmo, yeah, what should have been a linker error if you forget `-lstdc++fs` becomes undefined behaviour and a runtime crash. Nasty. According to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90050#c8 and https://bugs.launchpad.net/ubuntu/+source/gcc-8/+bug/1824721/comments/17 it should now be fixed, so be sure you're up to date. – Jonathan Wakely Sep 02 '19 at 16:45
  • My local machine is probably updated then but I should probably check what they run in the containers in the CI machinery where `boost::filesystem` is part of the mix too. – Ted Lyngmo Sep 02 '19 at 16:48

I'll summarize my own findings together with what other folks found in the comments. This is not an actual answer (yet), since at this time I cannot explain the reason for the failure.

I was able to reproduce this behavior by installing g++-8 and g++-9 inside a regular ubuntu Docker image, so that I had both /usr/bin/g++-8 and /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.26 available.

According to the gdb stack trace, the error happens somewhere in the std::vector copy constructor. It seems to happen when the defaulted copy constructor of std::filesystem::path is called inside its operator/:

/usr/include/c++/8/bits/fs_path.h

  /// Append one path to another
  inline path operator/(const path& __lhs, const path& __rhs)
  {
    path __result(__lhs);  // <-- fails here
    __result /= __rhs;
    return __result;
  }

This finding makes it possible to simplify the test case even more:

#include <filesystem>

int main(void)
{
  const std::filesystem::path first = "/tmp";
  const std::filesystem::path second(first);

  return 0;
}

which makes it clear that the problem is somewhere in calling the copy constructor.

The only vector in std::filesystem::path is this vector (presumably, of path components):

/usr/include/c++/8/bits/fs_path.h

    struct _Cmpt;
    using _List = _GLIBCXX_STD_C::vector<_Cmpt>;
    _List _M_cmpts; // empty unless _M_type == _Type::_Multi

According to the stack trace, when copying this vector, we immediately get into stl_vector.h:

/usr/include/c++/8/bits/stl_vector.h

      vector(const vector& __x)
      : _Base(__x.size(),
        _Alloc_traits::_S_select_on_copy(__x._M_get_Tp_allocator()))
      {

but if we print the value of __n in the constructor of _Vector_base here:

      _Vector_base(size_t __n, const allocator_type& __a)
      : _M_impl(__a)
      { _M_create_storage(__n); }

we'll get some insanely large number, which makes me think that an incorrect vector __x was somehow passed down to the copy constructor.
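
The step from a garbage size to std::bad_alloc can be reproduced in isolation. The sketch below is an illustration, not libstdc++'s code: Component is a hypothetical stand-in for path::_Cmpt, and allocation_fails is a helper invented here. It asks std::allocator for a given element count; a count like the one in the stack trace is more than the allocator can represent, so the request is rejected with std::bad_alloc (or a type derived from it).

```cpp
#include <cstddef>
#include <memory>
#include <new>
#include <string>

// Hypothetical stand-in for std::filesystem::path::_Cmpt.
struct Component {
  std::string text;
  std::size_t position;
};

// Returns true if allocating room for n Components throws std::bad_alloc.
bool allocation_fails(std::size_t n) {
  std::allocator<Component> alloc;
  try {
    Component* p = alloc.allocate(n);
    alloc.deallocate(p, n);  // only reached for sane sizes
    return false;
  } catch (const std::bad_alloc&) {
    return true;
  }
}
```

Calling allocation_fails(12297828079348111650ULL) with the value of __n from the stack trace returns true, while a small count such as 1 succeeds. In the original program nothing catches the exception, which is why the trace ends in std::terminate and abort.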

Now, why that happens when you combine g++-8 with the libraries of g++-9, I have no idea (for now) and I'm guessing one should go one level deeper if they need to understand the real reason.

But the answer to your main question, I guess, is "The problem is caused by an incompatibility between your compiler and library versions" :)

afenster
  • It looks like the constructor calls _M_split_cmpts which somehow leaves the vector borked. – David Schwartz Jun 25 '19 at 01:36
  • _"The problem is caused by an incompatibility between your compiler and library versions"_ -- you're right, the problem is that the OP didn't create that incompatibility, Ubuntu did :-( See my answer for the missing pieces of the puzzle. – Jonathan Wakely Sep 02 '19 at 16:35
  • Thanks @JonathanWakely - your answer should be the accepted one :) – afenster Sep 03 '19 at 18:31