
Executive summary: a Python module is linked against a different version of libstdc++.dylib than the Python executable. The result is that calls to iostream from the module crash.

Backstory

I'm creating a Python module using SWIG on an older computer (running 10.5.8). For various reasons, I am using GCC 4.5 (installed via MacPorts) to do this, using Python 2.7 (installed via MacPorts, compiled using the system-default GCC 4.0.1).

Observed Behavior

To make a long story short: calling str( myObject ) in Python causes the C++ code in turn to call std::operator<< <std::char_traits<char> >. This generates the following error:

Python(487) malloc: *** error for object 0x69548c: Non-aligned pointer being freed
*** set a breakpoint in malloc_error_break to debug

Setting a breakpoint and calling backtrace when it fails gives:

#0  0x9734de68 in malloc_error_break ()
#1  0x97348ad0 in szone_error ()
#2  0x97e6fdfc in std::string::_Rep::_M_destroy ()
#3  0x97e71388 in std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string ()
#4  0x97e6b748 in std::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::overflow ()
#5  0x97e6e7a0 in std::basic_streambuf<char, std::char_traits<char> >::xsputn ()
#6  0x00641638 in std::__ostream_insert<char, std::char_traits<char> > ()
#7  0x006418d0 in std::operator<< <std::char_traits<char> > ()
#8  0x01083058 in meshLib::operator<< <tranSupport::Dimension<(unsigned short)1> > (os=@0xbfffc628, c=@0x5a3c50) at /Users/sethrj/_code/pytrt/meshlib/oned/Cell.cpp:21
#9  0x01008b14 in meshLib_Cell_Sl_tranSupport_Dimension_Sl_1u_Sg__Sg____str__ (self=0x5a3c50) at /Users/sethrj/_code/_build/pytrt-gcc45DEBUG/meshlib/swig/mesh_onedPYTHON_wrap.cxx:4439
#10 0x0101d150 in _wrap_Cell_T___str__ (args=0x17eb470) at /Users/sethrj/_code/_build/pytrt-gcc45DEBUG/meshlib/swig/mesh_onedPYTHON_wrap.cxx:8341
#11 0x002f2350 in PyEval_EvalFrameEx ()
#12 0x002f4bb4 in PyEval_EvalCodeEx ()
[snip]

Suspected issue

I believe the issue to be that my code links against a new version of libstdc++:

/opt/local/lib/gcc45/libstdc++.6.dylib (compatibility version 7.0.0, current version 7.14.0)

whereas the Python binary has a very indirect dependence on the system libstdc++, which loads first (output from info shared in gdb):

  1 dyld                  - 0x8fe00000        dyld Y Y /usr/lib/dyld at 0x8fe00000 (offset 0x0) with prefix "__dyld_"
  2 Python                - 0x1000            exec Y Y /opt/local/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python (offset 0x0)
                                          (objfile is) /opt/local/bin/python
  3 Python                F 0x219000          dyld Y Y /opt/local/Library/Frameworks/Python.framework/Versions/2.7/Python at 0x219000 (offset 0x219000)
  4 libSystem.B.dylib     - 0x9723d000        dyld Y Y /usr/lib/libSystem.B.dylib at 0x9723d000 (offset -0x68dc3000)
                                 (commpage objfile is) /usr/lib/libSystem.B.dylib[LC_SEGMENT.__DATA.__commpage]
  5 CoreFoundation        F 0x970b3000        dyld Y Y /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation at 0x970b3000 (offset -0x68f4d000)
  6 libgcc_s.1.dylib      - 0x923e6000        dyld Y Y /usr/lib/libgcc_s.1.dylib at 0x923e6000 (offset -0x6dc1a000)
  7 libmathCommon.A.dylib - 0x94af5000        dyld Y Y /usr/lib/system/libmathCommon.A.dylib at 0x94af5000 (offset -0x6b50b000)
  8 libicucore.A.dylib    - 0x97cf4000        dyld Y Y /usr/lib/libicucore.A.dylib at 0x97cf4000 (offset -0x6830c000)
  9 libobjc.A.dylib       - 0x926f0000        dyld Y Y /usr/lib/libobjc.A.dylib at 0x926f0000 (offset -0x6d910000)
                                 (commpage objfile is) /usr/lib/libobjc.A.dylib[LC_SEGMENT.__DATA.__commpage]
 10 libauto.dylib         - 0x95eac000        dyld Y Y /usr/lib/libauto.dylib at 0x95eac000 (offset -0x6a154000)
 11 libstdc++.6.0.4.dylib - 0x97e3d000        dyld Y Y /usr/lib/libstdc++.6.0.4.dylib at 0x97e3d000 (offset -0x681c3000)
 12 _mesh_oned.so         - 0x1000000         dyld Y Y /Users/sethrj/_code/_build/pytrt-gcc45DEBUG/meshlib/swig/_mesh_oned.so at 0x1000000 (offset 0x1000000)
 13 libhdf5.7.dylib       - 0x122c000         dyld Y Y /opt/local/lib/libhdf5.7.dylib at 0x122c000 (offset 0x122c000)
 14 libz.1.2.5.dylib      - 0x133000          dyld Y Y /opt/local/lib/libz.1.2.5.dylib at 0x133000 (offset 0x133000)
 15 libstdc++.6.dylib     - 0x600000          dyld Y Y /opt/local/lib/gcc45/libstdc++.6.dylib at 0x600000 (offset 0x600000)
[snip]

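Note that the malloc error occurs in the address range of the system libstdc++ (the 0x97e----- frames in the backtrace, with the library loaded at 0x97e3d000), not in the one the shared library is linked against (loaded at 0x600000).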
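To make the address argument concrete, here is a minimal sketch that maps backtrace addresses to loaded images using the load addresses from the `info shared` listing. The helper name `image_for` is hypothetical, and because the listing gives only base addresses (and is itself snipped), picking the nearest lower base is a heuristic, not an exact lookup.

```python
import bisect

# (load address, image) pairs copied from the `info shared` output above.
# The listing was snipped, so this table is incomplete by construction.
LOADED_IMAGES = sorted([
    (0x00001000, "Python (executable)"),
    (0x00133000, "/opt/local/lib/libz.1.2.5.dylib"),
    (0x00219000, "Python.framework/Versions/2.7/Python"),
    (0x00600000, "/opt/local/lib/gcc45/libstdc++.6.dylib"),
    (0x01000000, "_mesh_oned.so"),
    (0x0122c000, "/opt/local/lib/libhdf5.7.dylib"),
    (0x8fe00000, "/usr/lib/dyld"),
    (0x923e6000, "/usr/lib/libgcc_s.1.dylib"),
    (0x926f0000, "/usr/lib/libobjc.A.dylib"),
    (0x94af5000, "/usr/lib/system/libmathCommon.A.dylib"),
    (0x95eac000, "/usr/lib/libauto.dylib"),
    (0x970b3000, "CoreFoundation"),
    (0x9723d000, "/usr/lib/libSystem.B.dylib"),
    (0x97cf4000, "/usr/lib/libicucore.A.dylib"),
    (0x97e3d000, "/usr/lib/libstdc++.6.0.4.dylib"),
])
BASES = [base for base, _ in LOADED_IMAGES]


def image_for(address):
    """Return the image with the highest load address <= address, or None.

    Heuristic only: image sizes are unknown, so an address past the end of
    an image would still be attributed to it.
    """
    index = bisect.bisect_right(BASES, address) - 1
    return LOADED_IMAGES[index][1] if index >= 0 else None


# Frames #4 to #8 of the backtrace, innermost first.
FRAMES = [
    (0x97e6b748, "std::basic_stringbuf::overflow"),
    (0x97e6e7a0, "std::basic_streambuf::xsputn"),
    (0x00641638, "std::__ostream_insert"),
    (0x006418d0, "std::operator<<"),
    (0x01083058, "meshLib::operator<<"),
]

for address, symbol in FRAMES:
    print("%#010x  %-32s %s" % (address, symbol, image_for(address)))
```

Under this heuristic, `__ostream_insert` and `std::operator<<` fall in the GCC 4.5 libstdc++, while `xsputn` and `overflow` fall in the system copy, which is the jump between libraries described in the question.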

Attempted resolutions

I tried to force MacPorts to build Python using GCC 4.5 rather than the Apple compiler, but the install phase fails because it needs to create a Mac "Framework", which vanilla GCC apparently doesn't do.

Even with the -static-libstdc++ compiler flag, __ostream_insert calls the std::basic_streambuf from the system-loaded shared library.

I tried modifying DYLD_LIBRARY_PATH by prepending /opt/local/lib/gcc45/, but to no avail.

What can I do to get this to work? I'm at my wit's end.

More information

This problem seems to be common on Mac OS X. Notice how, in all of the debug output, the address jumps between the calls to std::__ostream_insert and std::basic_streambuf::xsputn: execution is leaving the new GCC 4.5 code and jumping into the older shared library code in /usr/lib. Now, to find a workaround...

Seth Johnson
  • While Python can be really annoying with compiler and library versions, it seems odd that this would be your particular issue. Are you sure your code doesn't do something funny with a pointer (like the string object)? Have you tried assembling your string in a different way (string +, sprintf, etc)? – Adam Sep 26 '11 at 21:58
  • MacPorts port files are designed to install with the Apple-supplied (Xcode) gcc tool chain unless otherwise specified. You'll be battling against the wind to try to change how MacPorts builds Python and all its dependencies. The MacPorts `swig-python` port does not work? – Ned Deily Sep 26 '11 at 22:24
  • Yes, this is with MacPorts swig-python, and I'm sure that the code is perfectly legitimate. In a standalone C++ executable compiled with identical code, it works fine. I've double-checked every phase in my build chain. (I also use -Wall -Wextra.) – Seth Johnson Sep 26 '11 at 22:41
  • This could be a thread-safety problem, especially if you're changing the locale in the stream conversion? – James Sep 27 '11 at 15:47
  • Similar issues were encountered [here](https://trac.macports.org/ticket/29394) and [here](https://bitbucket.org/brickenstein/polybori/issue/1/non-aligned-pointer-being-freed-on-os-x). The code in __ostream_insert in one shared library calls xsputn in a different shared library, and they appear to be binary incompatible. – Seth Johnson Oct 05 '11 at 14:31
  • Correct me if I'm wrong, but python 2.7 was compiled against GCC4.0 (or 4.2) right? Knowing this, do you think compiling with gcc4.5 might give you trouble? So you would need to recompile Python 2.7 with gcc 4.5 in order to compile your extension? Not sure about this, but I remember having some trouble with something similar. Ref link: http://stackoverflow.com/questions/3552307/why-are-the-python-org-os-x-installers-built-with-gcc-4-0 –  Oct 05 '11 at 14:44
  • @Xavier: Correct. Python was compiled against the Apple-included GCC 4.0. However, because of the way macports is set up, it is impossible for me to recompile it with gcc 4.5, which I already tried. – Seth Johnson Oct 05 '11 at 14:46

2 Answers


Solved it. I discovered that this problem is not uncommon when mixing GCC versions on the Mac. After reading this solution for mpich and checking the mpich source code, I found that the fix is to add the following flag to the gcc command line on Mac systems:

-flat_namespace

I am so happy. I wish this hadn't taken me a week to figure out. :)
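For a module built through a Python build script, the flag could be applied along these lines. This is only a sketch: the question does not say which build system is used, and the helper name `extra_link_args` is hypothetical.

```python
import sys


def extra_link_args(platform=sys.platform):
    """Return extra link flags for the extension module (hypothetical helper).

    -flat_namespace is specific to the Darwin toolchain, so it is only
    added when building on a Mac.
    """
    if platform == "darwin":
        return ["-flat_namespace"]
    return []


print(extra_link_args())
```

With setuptools, the returned list could be passed as the `extra_link_args` argument of `Extension`. In a hand-written makefile, the same flag would simply go on the gcc line that links `_mesh_oned.so`.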

Seth Johnson

Run Python in GDB and set a breakpoint on malloc_error_break. That will show you what's being freed that wasn't allocated. I doubt that this is an ABI incompatibility between the versions of libstdc++.

Zach Riggle
  • Did you even read the question? That's the output I posted. It's a breakpoint in `malloc_error_break`, being called from the `0x97------` memory space, which is where the older `libstdc++` is loaded. – Seth Johnson Oct 01 '11 at 12:00