4

I use a third-party library in my C++ program which, under certain circumstances, raises the SIGABRT signal. I know that freeing an uninitialized pointer, or something similar, can cause this signal. Nevertheless, I want my program to keep running after the signal is raised, so that it can show a message and let the user change the settings that led to it.
(I develop with Qt.)

How can I do that?

Yunnosch
  • 1
    Possible duplicate of [Basic signal handling in C++](https://stackoverflow.com/questions/3817631/basic-signal-handling-in-c) – user4581301 Feb 23 '18 at 06:19
  • 3
    Mind you, SIGABRT should only be raised when something catastrophic is detected. Continuing after this is generally not recommended. Crom only knows if the program can be continued without crashing or getting really weird. – user4581301 Feb 23 '18 at 06:23
  • 4
    `SIGABRT` indicates that program detected some critical condition and can not continue. Instead of trying to silence it you should figure out the cause and fix it. – user7860670 Feb 23 '18 at 06:24
  • @user4581301 thanks, but the answer to that question just uses an infinite loop, which is not a real problem. – morteza ali ahmadi Feb 23 '18 at 06:33
  • thanks @VTT, this happens when the user chooses abnormal settings for the third-party library's functions. So I would have to change that library's source code, and I can't. – morteza ali ahmadi Feb 23 '18 at 06:36
  • Instead of registering SIGHUP, register SIGABRT and handle it as you wish. When the program returns to your control from the signal, assuming it doesn't instantly die, you can check the flag and emit a warning. – user4581301 Feb 23 '18 at 06:38
  • @user4581301 I know what you mean, but the program is waiting to close, so whatever I do, it will be closed afterwards. – morteza ali ahmadi Feb 23 '18 at 06:44

4 Answers

6

I use a third-party library in my C++ program which under certain circumstances raises the SIGABRT signal

If you have the source code of that library, you need to correct the bug (and the bug could be in your code).

BTW, probably SIGABRT happens because abort(3) gets indirectly called (perhaps because you violated some conventions or invariants of that library, which might use assert(3) - and indirectly call abort). I guess that in caffe the various CHECK* macros could indirectly call abort. I leave you to investigate that.

If you don't have the source code or don't have the capacity or time to fix that bug in that third party library, you should give up using that library and use something else.

In many cases, you should trust external libraries more than your own code. Probably, you are abusing or misusing that library. Read carefully its documentation and be sure that your own code calling it is using that library correctly and respects its invariants and conventions. Probably the bug is in your own code, at some other place.

I want to keep running my program

This is impossible (or very unreliable, so unreasonable). I guess that your program has some undefined behavior. Be very scared, and work hard to avoid UB.

You need to improve your debugging skills. Learn better how to use the gdb debugger, valgrind, GCC sanitizers (e.g. instrumentation options like -fsanitize=address, -fsanitize=undefined and others), etc...

You should not, within reason, try to handle SIGABRT, even if in principle you could (in that case, read signal(7) and signal-safety(7) carefully, along with the hints about handling Unix signals in Qt). I strongly recommend not even trying to catch SIGABRT.

Basile Starynkevitch
  • thanks, I use a third-party library named `caffe`, which is used to train neural networks in C++. This library has a console interface that it can be run from. In the console, if I choose abnormal settings, the program shows an error. But now I use the library from code and there is no console, so after illegal settings SIGABRT is raised. By default this library has no errors. – morteza ali ahmadi Feb 23 '18 at 06:55
  • Yes, as I guessed the bug is in your own code. Fix it as soon as possible (e.g. check that the settings are correct before calling functions in `caffe`) – Basile Starynkevitch Feb 23 '18 at 06:57
  • 1
    @mortezaaliahmadi if you produce a [mcve] , and still can't figure out the problem from having constructed the MCVE, you should use the MCVE as the basis for a new question. – user4581301 Feb 23 '18 at 06:59
  • thanks again, ok, but the `caffe` source code is very complicated. I will try to produce an MCVE of that. – morteza ali ahmadi Feb 23 '18 at 07:05
  • @BasileStarynkevitch A third party library is not something "you can trust more than your code" by definition. They are just software developed by someone else. And the example at hand also proves what I think. Sending SIGABRT is a bad design decision for caffe. Why would a third party library abort the application using it? It could return an error, maybe even fall into a state in which it ceases processing further inputs from the application, but a library making a decision for the lifetime of the application using it is plain wrong. – simurg Feb 23 '18 at 07:30
  • Not by definition, but in practice (trust external libraries more than your own code), especially for a newbie. BTW, `caffe` is probably not sending `SIGABRT` but simply calling `abort` in some "impossible" cases (perhaps some `assert` failure). Applications are expected to respect the guidelines and conventions of every library they use... – Basile Starynkevitch Feb 23 '18 at 08:30
  • And a good library is expected to gracefully handle any errors introduced by the application using it, not abort it. It can just return an error stating that the input is invalid ("impossible") and cease operation. Why would you want to kill (ok, abort) the process? – simurg Feb 23 '18 at 08:53
  • 2
    @simurg, you do have a point about trusting libraries. OTOH, trust is something that's earnt, and most libraries have had more testing than any code you or I wrote today! We can't assume the library is infallible, but it's less likely to be wrong, and I'd always start bug hunting in my own code. I fully agree that closed-source libraries are no help, and would never recommend using any. – Toby Speight Feb 23 '18 at 09:10
  • 1
    @simurg I'm of three minds on that. On one hand, you want to provide some sort of diagnostic to help the caller sort out what's gone wrong and `abort`s not too good at that by itself. On the other hand you want to make the programmer deal with the problem and not ignore the return code and try to continue. A swift and well-placed boot to the head can save a lot of debugging. On the gripping hand, this sounds like a good place for a `throw`. – user4581301 Feb 23 '18 at 16:46
3

Unfortunately, you can't. The SIGABRT signal is itself raised by abort().

Ref: https://stackoverflow.com/a/3413215/9332965

Ashish Gupta
  • 2
    You can trap `SIGABRT` as easily as most other signals. Whether you can do anything useful afterwards is a different question. – Toby Speight Feb 23 '18 at 09:11
0

You can handle SIGABRT, but you probably shouldn't.


The "can" is straightforward - just trap it in the usual way, using signal(). You don't want to return from this signal handler - you probably got here from abort() - possibly originally from assert() - and that function will exit after raising the signal. You could however longjmp() back to a state you set up earlier.
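As a rough illustration of the "can", here is a minimal sketch that assumes a POSIX system and a single-threaded caller. `risky_library_call` is a hypothetical stand-in for the third-party function, and `call_guarded`, `on_sigabrt` and `g_recovery_point` are names invented for this example. It uses `sigsetjmp()`/`siglongjmp()` rather than plain `setjmp()`/`longjmp()` so that the signal mask is restored after the jump.

```cpp
#include <setjmp.h>   // sigjmp_buf, sigsetjmp, siglongjmp (POSIX)
#include <signal.h>   // signal, SIGABRT
#include <cassert>
#include <cstdlib>    // std::abort

// Where the handler jumps back to. A single global buffer, so this sketch
// is only meaningful for a single-threaded caller.
static sigjmp_buf g_recovery_point;

extern "C" void on_sigabrt(int) {
    // Do not return from here: abort() would terminate the process anyway.
    siglongjmp(g_recovery_point, 1);
}

// Hypothetical stand-in for the library call that aborts on bad settings.
void risky_library_call(bool bad_settings) {
    if (bad_settings) std::abort();
}

// Returns true if the call completed, false if it ended in SIGABRT.
bool call_guarded(bool bad_settings) {
    auto previous = signal(SIGABRT, on_sigabrt);
    if (sigsetjmp(g_recovery_point, 1) != 0) {
        // We arrive here via siglongjmp() from the handler.
        signal(SIGABRT, previous);
        return false;
    }
    risky_library_call(bad_settings);
    signal(SIGABRT, previous);
    return true;
}
```

With this sketch, `call_guarded(false)` returns `true` and `call_guarded(true)` returns `false`. Note that in C++ jumping over frames whose objects have non-trivial destructors is undefined behavior, and whatever the library was doing when it aborted is left half-finished, so this only shows that control can be regained, not that the program is safe to continue.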


The "shouldn't" is because once SIGABRT has been raised, your data structures (including those of Qt and any other libraries) are likely in an inconsistent state and actually using any of your program's state is likely to be unpredictable at best. Apart from exiting immediately, there's not much you can do other than exec() a replacement program to take over in a sane initial state.

If you just want to show a friendly message, then you perhaps could exec() a small program to do that (or just use xmessage), but beware of exiting this with a success status where you would have had an indication of the SIGABRT otherwise.

Toby Speight
-2

Unfortunately there isn't much you can do to prevent SIGABRT from terminating your program. Not without modifying some code that was hopefully written by you.

You would either need to change the code so that it does not call abort(), or you would have to run that code in a new process instead of the current one. I do not suggest using a child process to solve this problem. The abort is most likely caused by misuse of an API or by exhaustion of a computer resource, such as low memory.
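For reference, the separate-process route could look roughly like the sketch below, assuming a POSIX system. `run_isolated` and `Outcome` are names invented for this example, and `task` stands for whatever code calls the library. The parent only learns how the child ended; any real results would have to be passed back through a pipe, a file, or similar.

```cpp
#include <sys/types.h>
#include <sys/wait.h>   // waitpid, WIFSIGNALED, WTERMSIG, WIFEXITED
#include <unistd.h>     // fork, _exit
#include <signal.h>     // SIGABRT
#include <cassert>
#include <cerrno>
#include <cstdlib>      // std::abort

enum class Outcome { Completed, Aborted, Failed };

// Runs task() in a child process and reports how that child ended.
Outcome run_isolated(void (*task)()) {
    pid_t pid = fork();
    if (pid < 0) return Outcome::Failed;   // fork itself failed
    if (pid == 0) {
        task();                            // may call abort()
        _exit(0);                          // leave without running parent cleanup
    }
    int status = 0;
    pid_t waited;
    do {
        waited = waitpid(pid, &status, 0); // retry if interrupted by a signal
    } while (waited < 0 && errno == EINTR);
    if (waited < 0) return Outcome::Failed;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGABRT) return Outcome::Aborted;
    if (WIFEXITED(status) && WEXITSTATUS(status) == 0) return Outcome::Completed;
    return Outcome::Failed;
}
```

Here `run_isolated([] { std::abort(); })` yields `Outcome::Aborted` while the parent keeps running. Calling `fork()` from a multi-threaded program (which a Qt application usually is) has its own pitfalls, so in practice a separate helper executable is probably the safer variant of this idea.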

dhoodlum
  • thanks @user4581301 just tried it out, and looks like I was completely wrong! I modified the answer. – dhoodlum Feb 23 '18 at 06:54
  • There is one way -- spawn a child process and call the library only from within the child process. That way if the child process crashes, the parent process can handle the crash by spawning another child process, or etc. That said, I really don't recommend doing that; the proper (and much less painful) approach is of course to identify what is causing the crash, and fix that (even if it means working together with the developers of the library to do so). – Jeremy Friesner Feb 23 '18 at 07:07
  • 1
    You can handle `SIGABRT` in the usual way, with `signal()`. – Toby Speight Feb 23 '18 at 09:12
  • @TobySpeight The user is trying to prevent the process from terminating. I tried that using POSIX methods for handling signals, but when SIGABRT is received the process still exits. I know you can prevent other signals, like SIGTERM, from terminating the process, but SIGABRT is a special signal. `signal()` cannot be used to solve this question. – dhoodlum Feb 23 '18 at 09:26
  • 1
    As long as you *don't return from the signal handler*, you're okay. You could call `longjmp()`, for example. – Toby Speight Feb 23 '18 at 09:56
  • I don’t think that solution applies here, because most programs would not work properly in that state. Given the program is using Qt, threads will probably cause problems there. Is it possible to block in a signal while launching multiple threads and child processes? I imagine the OS wouldn’t appreciate that very much :) – dhoodlum Feb 23 '18 at 11:25
  • @TobySpeight: That's the only correct answer to "How can a process continue running after `abort()` / `SIGABRT`?" Why don't you post it as one? – Ben Voigt Feb 24 '18 at 20:16