I made a C++ tool for off-screen rendering of 3D models. The rendering is done using the OSMesa library.
The software worked flawlessly for more than a year, and I stopped updating it about 6 months ago. In the meantime my development environment was updated multiple times.
I recently compiled it again and found an unexpected bug.
The plain (non-statically linked) version of the software still works as expected, but the statically linked one segfaults.
I'm assuming that the error is mine, somewhere in the OSMesa configuration/compilation/linking procedure, and not in the library code, but any advice on better debugging of the segmentation fault is appreciated.
Having tried numerous variations of the compilation process without success, I'm now quite stuck. Can anyone see something stupid I'm doing in any of the steps described below?
I compiled a static version of the OSMesa library, using the same version as the shared library that works on my system (12.0.6) and disabling all the features I don't need (I'm on an Ubuntu-based system, and no static version of the OSMesa library is available from the repositories):
./configure \
    --disable-xvmc \
    --disable-glx \
    --disable-dri \
    --with-dri-drivers="" \
    --with-gallium-drivers="" \
    --disable-shared-glapi \
    --disable-egl \
    --with-egl-platforms="" \
    --enable-osmesa \
    --enable-gallium-llvm=no \
    --disable-gles1 \
    --disable-gles2 \
    --enable-static \
    --disable-shared
This is the compile command of my off-screen rendering tool:
g++ -std=c++11 -Wall -O3 -g -static -static-libgcc -static-libstdc++ ./src/measure_model.cpp model.o thumbnail.o -o measure_model_debug -pthread -lOSMesa -ldl -lm -lpng -lz -lcrypto
This is a warning that I get when linking statically against OSMesa; it was already present a year ago, when the static binary still worked:
/home/XXX/XXX/backend/lambda/mesa/mesa-12.0.6/src/mesa/main/dlopen.h:52: warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
This is what I get from running the tool:
Segmentation fault (core dumped)
No segmentation fault occurs if I simply skip the OSMesa context creation step (and, obviously, all the 3D rendering).
This is the backtrace:
#0  0x0000000000000000 in ?? ()
#1  0x00000000004af20a in mtx_init (type=4, mtx=0xe10f70) at ../../include/c11/threads_posix.h:215
#2  _mesa_NewHashTable () at main/hash.c:135
#3  0x000000000052f295 in _mesa_alloc_shared_state (ctx=ctx@entry=0xdcc9b0) at main/shared.c:67
#4  0x000000000046e717 in _mesa_initialize_context (ctx=ctx@entry=0xdcc9b0, api=api@entry=API_OPENGL_COMPAT, visual=<optimized out>, share_list=share_list@entry=0x0, driverFunctions=driverFunctions@entry=0x7fffffffcd40) at main/context.c:1192
#5  0x000000000046c870 in OSMesaCreateContextAttribs (attribList=attribList@entry=0x7fffffffd290, sharelist=<optimized out>) at osmesa.c:834
#6  0x000000000046ccdc in OSMesaCreateContextExt (format=<optimized out>, depthBits=<optimized out>, stencilBits=<optimized out>, accumBits=<optimized out>, sharelist=<optimized out>) at osmesa.c:660
#7  0x0000000000468742 in generate_thumbnail(Model*, Json::Value) ()
#8  0x0000000000401c7d in main (argc=<optimized out>, argv=<optimized out>) at ./src/measure_model.cpp:107
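Frame #0 sits at address 0 and is reached from `mtx_init` in Mesa's pthread-based C11 threads wrapper, so one way to narrow this down would be to check whether the same kind of call crashes without OSMesa involved at all. The following is only a minimal sketch, under the assumption (not verified against the Mesa source) that `mtx_init` with `type=4` sets up a recursive mutex through the usual pthread mutex-attribute calls. The function name `init_recursive_mutex_probe` is hypothetical and is not part of Mesa or of the tool.

```cpp
#include <pthread.h>

// Hypothetical helper (not part of Mesa or of the tool): creates a recursive
// mutex with plain pthread calls, locks it twice, then cleans up.
// Returns 0 on success, otherwise the first non-zero pthread error code.
int init_recursive_mutex_probe() {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0) return rc;

    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc != 0) {
        pthread_mutexattr_destroy(&attr);
        return rc;
    }

    pthread_mutex_t mtx;
    rc = pthread_mutex_init(&mtx, &attr);
    pthread_mutexattr_destroy(&attr);  // the attribute object is not needed after init
    if (rc != 0) return rc;

    // A recursive mutex may be locked again by the thread that already owns it.
    rc = pthread_mutex_lock(&mtx);
    if (rc == 0) {
        rc = pthread_mutex_lock(&mtx);
        if (rc == 0) pthread_mutex_unlock(&mtx);
        pthread_mutex_unlock(&mtx);
    }

    pthread_mutex_destroy(&mtx);
    return rc;
}
```

Calling this from a tiny `main` and building it with the same `-static -static-libgcc -static-libstdc++ ... -pthread` flags as the tool should show whether the crash reproduces without OSMesa. If it does, the problem would more likely be in how the pthread symbols get resolved in the static link than in the OSMesa build itself.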
A statically linked binary is a strict requirement.
The segmentation fault happens on the same machine I use to compile the tool (the OSMesa static library is compiled on that machine too), while the non-statically linked version of the same tool runs there without any segmentation fault.