5

What is the technique to log the segmentation faults and run time errors which crash the program, through a remote logging library?

The language is C++.

Aquarius_Girl

4 Answers

6

Here is a solution for printing a backtrace when you get a segfault, as an example of what you can do when such an error happens.

That leaves you with the problem of getting the error to the remote logging library. I would suggest keeping the signal handler as simple as possible and logging to a local file, because you cannot assume that a previously initialized logging library still works correctly once a segmentation fault has occurred.
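As a rough illustration of the "simple handler, local file" idea, here is a minimal POSIX sketch. The function name install_crash_handler, the variable g_crash_fd, and the message text are made up for this example. It assumes the crash file is opened at startup, so the handler itself only uses async-signal-safe calls (write, signal, raise).

```cpp
#include <csignal>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical global: descriptor of the local crash file, opened at startup
// so that the handler never has to call open() or allocate memory.
static int g_crash_fd = -1;

// Runs in signal context: only async-signal-safe functions are used here.
static void crash_handler(int sig) {
    static const char msg[] = "fatal signal received\n";
    if (g_crash_fd >= 0) {
        ssize_t ignored = write(g_crash_fd, msg, sizeof(msg) - 1);
        (void)ignored;
    }
    // Restore the default action and re-raise, so the process still
    // terminates (and dumps core, if core dumps are enabled).
    signal(sig, SIG_DFL);
    raise(sig);
}

// Opens the crash file and installs the handler for SIGSEGV.
// Returns true on success.
bool install_crash_handler(const char* path) {
    g_crash_fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (g_crash_fd < 0) return false;

    struct sigaction sa;
    std::memset(&sa, 0, sizeof(sa));
    sa.sa_handler = crash_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, nullptr) == 0;
}
```

The remote logging library is deliberately not touched in the handler; the file can be picked up and forwarded by a healthy process later.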

Rafał Rawicki
  • The local file can be sent via the remote logging later, probably the next time your application starts (which will be pretty quick if you have a watchdog). – Rafał Rawicki May 08 '12 at 10:35
1

I'd like to give some solutions:

  1. Enable core dumps and start a daemon that monitors for new core dumps, collects them, and sends them to your host.
  2. Use GDB with gdbserver: you can debug remotely and see the backtrace if the program crashes.
wuliang
1

What is the technique to log the segmentation faults and run time errors which crash the program, through a remote logging library?

From my experience, trying to log debugging messages (remotely or to a file) while the program is crashing might not be very reliable, especially if the app takes the system down along with it:

  1. With a TCP connection you might lose the last several messages while the system is crashing. (TCP preserves ordering and retransmits lost data, AFAIK, but if the app just quits, some buffered data can be lost before being transmitted.)
  2. With a UDP connection you might lose messages because of the nature of UDP, and receive them out of order.
  3. If you're writing to a file, the OS might discard the most recent changes (buffers not flushed, a journaled filesystem reverting to an earlier state of the file).
  4. Flushing buffers after every write, or sending every message via TCP/UDP, might induce performance penalties for a program that produces thousands of messages per second.

So, as far as I know, a good idea is to maintain an in-memory plaintext log and write a core dump once the program has crashed. This way you'll be able to find the contents of the log within the core dump. Writing to an in-memory log will also be significantly faster than writing to a file or sending messages over the network. Alternatively, you could use some kind of "dual logging": write every debug message immediately into the in-memory log, and then send the messages asynchronously (in another thread) to a log file or over the network.
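A minimal sketch of such an in-memory plaintext log, assuming a fixed-size ring buffer is acceptable (old text is overwritten once the buffer is full). The class name MemoryLog and its interface are invented for this example. The text is kept in one contiguous char array, so it should be easy to spot in a core dump (for example with the strings utility).

```cpp
#include <array>
#include <cstddef>
#include <mutex>
#include <string>

// Fixed-size in-memory log. The newest text overwrites the oldest once
// Size bytes have been written.
template <std::size_t Size>
class MemoryLog {
public:
    // Appends one line (a trailing newline is added).
    void append(const std::string& line) {
        std::lock_guard<std::mutex> lock(mutex_);
        for (char c : line) put(c);
        put('\n');
    }

    // Returns the retained text, oldest byte first.
    std::string contents() const {
        std::lock_guard<std::mutex> lock(mutex_);
        const std::size_t count = written_ < Size ? written_ : Size;
        std::string out;
        out.reserve(count);
        for (std::size_t i = written_ - count; i < written_; ++i)
            out.push_back(buffer_[i % Size]);
        return out;
    }

private:
    void put(char c) {
        buffer_[written_ % Size] = c;
        ++written_;
    }

    mutable std::mutex mutex_;
    std::array<char, Size> buffer_{};  // contiguous plaintext storage
    std::size_t written_ = 0;          // total bytes ever written
};
```

For the "dual logging" variant, a background thread could periodically call contents() (or drain a separate queue) and forward the text to a file or the network, while the hot path only pays for the in-memory copy.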

Handling of exceptions:

Platform-specific. On Windows you can use _set_se_translator, either to generate a backtrace or to translate platform (structured) exceptions into C++ exceptions.

On Linux I think you should be able to install a handler for the SIGSEGV signal.

While catching a segfault sounds like a decent idea, instead of trying to handle it from within the program it makes more sense to generate a core dump and bail. On Windows you can call MiniDumpWriteDump from within the program, and on Linux the system can be configured to produce core dumps from the shell (ulimit -c, I think?).
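On Linux, the same limit that ulimit -c controls in the shell can also be raised from inside the process with getrlimit/setrlimit. This is only a sketch: the helper name enable_core_dumps is made up here, and whether and where a core file actually appears still depends on the hard limit and on system configuration.

```cpp
#include <sys/resource.h>

// Raises the soft core-file size limit to the hard limit for this process.
// Returns true on success. If the hard limit is 0, core dumps stay disabled.
bool enable_core_dumps() {
    struct rlimit lim;
    if (getrlimit(RLIMIT_CORE, &lim) != 0) return false;
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}
```

Calling this early in main() means a later crash can produce a core dump without relying on how the program was launched.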

SigTerm
  • Has any C++ program ever made the system crash? – wuliang May 08 '12 at 17:28
  • @wuliang: **YES**, and the answer is **very** obvious. No complex system is ever completely bugfree, so it is possible to hit a system/driver bug which will crash entire system along with the program. Especially if you're dealing with some kind of hardware (CUDA/DirectX/OpenGL). – SigTerm May 08 '12 at 17:42
  • Since the Linux kernel has no module implemented in C++, we may just assume C++ are all user domain applications. If user domain can make the kernel crash, it just means a bad design of the user/kernel API. I think most (CUDA/DirectX/OpenGL) crashes are GUI/X Window crashes, not system crashes. In fact, there are many tips for kernel debugging, such as RAM/NAND logging, but they are common tasks, not C++ related. – wuliang May 08 '12 at 18:19
  • @wuliang: Platform dependent. "I think most" Not quite correct. That'll be a *driver*-level crash. On Windows it'll kill the system, cause a BSOD and a memory dump (if configured), and reboot. What happens on Linux depends on whether the corresponding driver is part of X or the kernel, and on how the kernel behaves in this situation. "we may just assume C++ are all user domain" Irrelevant. If you're lucky enough to discover a new bug in the kernel or a driver, you'll need a way to find out what the hell happened and bypass the problem - bugs don't get fixed instantly, and your users will still need your app to work properly. – SigTerm May 08 '12 at 18:36
0

To catch the segfault signal and send a log accordingly, read this post:

Is there a point to trapping "segfault"?

If it turns out that you won't be able to send the log from a signal handler (maybe the crash occurred before the logger was initialized), then you may need to write the info to a file and have an external entity send it remotely.

EDIT: Putting back some original info to be able to send the core file remotely too

To be able to send the core file remotely, you'll need an external entity (a different process than the one that crashed) that will "wait" for core files and send them remotely as they appear (possibly using scp). Additionally, the crashing process could catch the segfault signal and notify the monitoring process that a crash has occurred and a core file will be available soon.

Brady
  • the external program can't be a part of the remote logging lib, can it be? – Aquarius_Girl May 08 '12 at 10:17
  • Ya, I just added a comment to your question, I originally thought you wanted to send the actual core file remotely. If you just want to log when a crash occurs, then the remote logging should be in the same process. But be careful about what Rafal Rawicki mentions in his answer. – Brady May 08 '12 at 10:20
  • @Anisha, added some of the original info – Brady May 08 '12 at 10:47