I have a server application that creates a bus on D-Bus, and after running for some minutes I got an error that I have never seen before. Do you have any idea what's wrong?

*** longjmp causes uninitialized stack frame ***: /home/user/Workspace/DBus_Server/Debug/DBus_Server terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x37)[0x7f8d8911c7f7]
/lib/x86_64-linux-gnu/libc.so.6(+0xf8789)[0x7f8d8911c789]
/lib/x86_64-linux-gnu/libc.so.6(__longjmp_chk+0x33)[0x7f8d8911c6f3]
/usr/lib/x86_64-linux-gnu/libcurl-nss.so.4(+0xd795)[0x7f8d88272795]
/lib/x86_64-linux-gnu/libc.so.6(+0x36420)[0x7f8d8905a420]
/lib/x86_64-linux-gnu/libc.so.6(__poll+0x53)[0x7f8d890f9773]
/usr/lib/libdbus-c++-1.so.0(_ZN4DBus15DefaultMainLoop8dispatchEv+0x161)[0x7f8d89b6b481]
/usr/lib/libdbus-c++-1.so.0(_ZN4DBus13BusDispatcher5enterEv+0x63)[0x7f8d89b6c293]
/home/user/Workspace/DBus_Server/Debug/DBus_Server[0x401333]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)[0x7f8d8904530d]
/home/user/Workspace/DBus_Server/Debug/DBus_Server[0x4011c9]
Jason R
Tobi Weißhaar
  • [This thread](http://permalink.gmane.org/gmane.comp.systems.archos.rockbox.cvs/32841) suggests that it means that you tried to longjmp to a stack frame that already exited. – Raymond Chen Feb 08 '12 at 11:11
  • I solved the error. It seems to be a libcurl bug, and after setting curl_easy_setopt(curl, CURLOPT_NOSIGNAL, 1) the error does not occur anymore. – Tobi Weißhaar Feb 08 '12 at 16:20
  • Put your answer inside an answer and accept it. I had the same problem and solved it with the solution you wrote. Maybe someone else will find this question as well when googling. – getekha Mar 28 '12 at 15:12
  • Seems to be fixed in Debian unstable: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=570436#74 - version 7.32.0-1 – jaywink Aug 30 '13 at 20:01

3 Answers

I ran into the same issue; as noted above, it is a curl bug. I thought I would put an answer up here to pull together all of the available information on the problem.

From the Red Hat bug report:

libcurl built without an asynchronous resolver library uses alarm() to time out DNS lookups. When a timeout occurs, this causes libcurl to jump from the signal handler back into the library with a sigsetjmp, which effectively causes libcurl to continue running within the signal handler. This is non-portable and could cause problems on some platforms. A discussion on the problem is available at http://curl.haxx.se/mail/lib-2008-09/0197.html

The "problems on some platforms" apparently refers to crashes on modern Linux systems at least. Some deeper technical details are at the link from the quote above:

There's a problem with the way libcurl currently handles the SIGALRM signal. It installs a handler for SIGALRM to force a synchronous DNS resolve to time out after a specified time, which is the only way to abort such a resolve in some cases. Just before the DNS resolve takes place it initializes a longjmp pointer so when the signal comes in the signal handler just does a siglongjmp, control continues from that saved location and the function returns an error code.

The problem is that all the following control flow executes effectively inside the signal handler. Not only is there a risk that libcurl could call an async handler unsafe function (see signal(7)) during this time, but it could call a user callback function that could call absolutely anything. In fact, siglongjmp() itself is not on the POSIX list of async-safe functions, and that's all the libcurl signal handler calls!
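
The pattern described in the quote can be sketched roughly as follows. This is only a minimal illustration under my own assumptions, not libcurl's actual code: the names `resolve_jmpenv`, `alarm_handler`, `slow_lookup` and `lookup_with_timeout` are made up, and nanosleep() stands in for the blocking DNS lookup.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical jump buffer, playing the role of the saved "longjmp pointer". */
static sigjmp_buf resolve_jmpenv;

static void alarm_handler(int signo)
{
    (void)signo;
    /* Jump back into the caller. Whatever the caller does next still runs
       on top of whatever code the signal happened to interrupt. */
    siglongjmp(resolve_jmpenv, 1);
}

/* Stand-in for a blocking, synchronous DNS lookup (assumption). */
static void slow_lookup(unsigned seconds)
{
    struct timespec ts;
    ts.tv_sec = (time_t)seconds;
    ts.tv_nsec = 0;
    nanosleep(&ts, NULL);
}

/* Returns 0 if the lookup finished, -1 if SIGALRM cut it short. */
int lookup_with_timeout(unsigned lookup_seconds, unsigned timeout_seconds)
{
    struct sigaction sa, old_sa;

    assert(timeout_seconds > 0);    /* alarm(0) would not arm a timer */

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = alarm_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGALRM, &sa, &old_sa);

    /* Save the return point; the 1 asks for the signal mask to be saved too. */
    if (sigsetjmp(resolve_jmpenv, 1) != 0) {
        /* Reached via siglongjmp from the handler: report a timeout. */
        sigaction(SIGALRM, &old_sa, NULL);
        return -1;
    }

    alarm(timeout_seconds);
    slow_lookup(lookup_seconds);
    alarm(0);                       /* lookup finished: cancel the alarm */
    sigaction(SIGALRM, &old_sa, NULL);
    return 0;
}
```

With this sketch, lookup_with_timeout(3, 1) gives up after about one second and returns -1, while lookup_with_timeout(0, 5) returns 0. It behaves here because the saved frame is still alive when the signal arrives. The glibc abort in the question is what you get when the jump is checked and the target frame is not a valid one to return to.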

There are a couple of ways to solve this problem, depending upon whether you built libcurl yourself or are stuck with one that was provided by your distribution or system admin:

  • If you can't rebuild libcurl, then you can call curl_easy_setopt(curl, CURLOPT_NOSIGNAL, 1) on all curl handles that you use. The documentation for CURLOPT_NOSIGNAL notes:

    Pass a long. If it is 1, libcurl will not use any functions that install signal handlers or any functions that cause signals to be sent to the process. This option is mainly here to allow multi-threaded unix applications to still set/use all timeout options etc, without risking getting signals. (Added in 7.10)

    If this option is set and libcurl has been built with the standard name resolver, timeouts will not occur while the name resolve takes place. Consider building libcurl with c-ares support to enable asynchronous DNS lookups, which enables nice timeouts for name resolves without signals.

    DNS timeouts are obviously desirable to have in most cases, so this isn't a perfect fix. If you have the ability to rebuild libcurl on your system, then you can...

  • There is an asynchronous DNS resolver library called c-ares that curl is capable of using for name resolution. Using this library is the preferred solution to the problem (and I would imagine most Linux packagers have figured this out by now). To enable c-ares support, first build and install the library, then pass the --enable-ares flag to curl's configure script before you build. Full instructions are here.

Jason R
  • Is it possible to reproduce this error from a terminal ? My process crash gives a similar stack trace but I can't verify if the problem is the same. – asloob Aug 04 '15 at 01:17
  • Note that using c-ares doesn't disable curl's alarm-based timeouts. You should still set the CURLOPT_NOSIGNAL option. – Pawel Veselov Jan 24 '17 at 21:43
  • Your solution fixed the problem for me. I have a multi-threaded C program that uses libcurl: `curl v.7.64.1`, curl features `Features: AsynchDNS HTTPS-proxy IPv6 Largefile libz NTLM NTLM_WB SSL TLS-SRP UnixSockets`, OS: `centos 6.10`. **The problem reproduced ONLY when I ran the program from a cron job; when I executed the C program from PHP files it did not reproduce.** – Accountant م May 17 '19 at 22:00

This should be fixed in curl 7.32.0 according to the Debian changelog, where the threaded DNS resolver has been enabled. The Debian package is in unstable and can be found here.

For Ubuntu 12.04 through 13.04 you can use this PPA:

sudo apt-add-repository ppa:jaywink/curldebian
sudo apt-get update && sudo apt-get upgrade

Ubuntu 13.10 includes curl 7.32, so it should not have this problem.

jaywink
  • I've updated curl, so when I call `curl --version` I see that its version is 7.32.0, but I still have the same problem. In 13.04, though, everything works fine. – Dmitrii Mikhailov Dec 15 '13 at 15:05
  • Sorry, on which version of Ubuntu does the problem occur, 13.10? – jaywink Dec 17 '13 at 09:31
  • If I'm reading launchpad correctly it seems to me that the threaded DNS resolver maybe never ended up in the launchpad version of 7.32 in saucy ... will do more checking – jaywink Dec 17 '13 at 09:40

Even though the discussion above suggests that the issue should be fixed as of curl 7.32, I was still getting the crash with curl 7.52.1-DEV on Ubuntu 18.04 with Gazebo.

A backtrace of the crash when running Gazebo under gdb:

*** longjmp causes uninitialized stack frame ***: /home/$USER/gazebo-11.8.1/bin/gzclient terminated

Thread 1 "gzclient" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51  ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) backtrace
#0  0x00007ffff5397fb7 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ffff5399921 in __GI_abort () at abort.c:79
#2  0x00007ffff53e2967 in __libc_message (action=action@entry=(do_abort | do_backtrace), fmt=fmt@entry=0x7ffff550f8fb "*** %s ***: %s terminated\n") at ../sysdeps/posix/libc_fatal.c:181
#3  0x00007ffff548db8f in __GI___fortify_fail_abort (need_backtrace=need_backtrace@entry=true, msg=0x7ffff550f8b0 <longjmp_msg> "longjmp causes uninitialized stack frame") at fortify_fail.c:33
#4  0x00007ffff548dbb1 in __GI___fortify_fail (msg=<optimized out>) at fortify_fail.c:44
#5  0x00007ffff548da4d in ____longjmp_chk () at ../sysdeps/unix/sysv/linux/x86_64/____longjmp_chk.S:100
#6  0x00007ffff548d9ab in __longjmp_chk (env=0x7ffff012fb40 <curl_jmpenv>, val=<optimized out>)
    at ../setjmp/longjmp.c:39
#7  0x00007fffefec8745 in  () at /usr/local/lib/libcurl.so
#8  0x00007ffff5398040 in <signal handler called> () at /lib/x86_64-linux-gnu/libc.so.6
#9  0x00007ffff546dcb9 in __GI___poll (fds=0x555559cbe2f0, nfds=4, timeout=7)
    at ../sysdeps/unix/sysv/linux/poll.c:29
#10 0x00007fffed4c56e9 in  () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#11 0x00007fffed4c57fc in g_main_context_iteration () at /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#12 0x00007ffff694a88f in QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>) ()
    at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#13 0x00007ffff68ef90a in QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>) ()
    at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#14 0x00007ffff68f89b4 in QCoreApplication::exec() () at /usr/lib/x86_64-linux-gnu/libQt5Core.so.5
#15 0x00007ffff725a856 in gazebo::gui::run(int, char**) (_argc=<optimized out>, _argv=0x7fffffffb4a8)
    at /home/$USER/gazebo-11.8.1/source/gazebo/gui/GuiIface.cc:442
#16 0x00005555555579f4 in main(int, char**) (_argc=1, _argv=0x7fffffffb4a8)
    at /home/$USER/gazebo-11.8.1/source/gazebo/gui/main.cc:32

Reading the backtrace from the bottom up: a signal arrived while the main loop was blocked in poll() (frames #9 and #8), the handler in libcurl.so ran (frame #7) and called longjmp with curl_jmpenv (frame #6), and glibc's longjmp check then aborted the process (frames #5 to #0).

This crash usually happens within 15 minutes of starting Gazebo, even when the program is left idle with an empty simulation, i.e. no model loaded and no computation happening: just the Gazebo client (gzclient) running by itself, with the Gazebo server (gzserver) running in another shell.

Checking the curl version (using curl --version) on my Ubuntu system gives:

curl 7.52.1-DEV (Linux) libcurl/7.52.1-DEV OpenSSL/1.0.2n zlib/1.2.11
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp smb smbs smtp smtps telnet tftp 
Features: IPv6 Largefile NTLM SSL libz UnixSockets HTTPS-proxy

The curl version above is clearly newer than 7.32, but I still get the crash.

Test with curl

I removed the curl shipped by default with Ubuntu 18.04 LTS and installed the latest curl (version 7.79.1 as of 17.09.2021) from source:

git clone https://github.com/curl/curl.git
cd curl
./buildconf
./configure --with-{dict,file,ftp,ftps,gopher,gophers,http,https,imap,imaps,mqtt,pop3,pop3s,rtsp,smb,smbs,smtp,smtps,telnet,tftp}
make
sudo make install
sudo ldconfig

After this I ran my program again and left it running overnight, this time with a simulation doing computation in a loop. After more than 15 hours, the simulation was still running in the morning.

So for me, the Gazebo crash seems to be solved with the new version of curl.

ggulgulia