
I wrote a C++ server/client pair using C++11, boost::asio and HDF5. The server ran fine for some time (2 days) and then stopped with exit code 137. Since I run the server inside an infinite loop, it was restarted automatically.

Unfortunately, my error logs don't contain enough information to explain the failure, so I've been trying to work out what this exit code means. The consensus seems to be that 137 is 128+9, where 9 is the signal number of SIGKILL, i.e. the program was killed as if by kill -9. I don't know why that happened, and I need help finding out.
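To make the 128+9 reading concrete, here is a minimal sketch of the decoding. It assumes the status was reported by a shell that follows the common "128 + signal number" convention (bash does), and that SIGKILL is signal 9, as it is on Linux. The function names are made up for illustration. A program could also call exit(137) itself, so this interpretation is a strong hint rather than proof.

```cpp
#include <cassert>
#include <csignal>
#include <string>

// Hypothetical helper: shells such as bash report "128 + N" when a child
// was terminated by signal N. Returns N, or 0 if the status looks like an
// ordinary exit code.
int signal_from_shell_status(int status) {
    assert(status >= 0 && status <= 255);  // exit statuses are 8 bits wide
    return status > 128 ? status - 128 : 0;
}

// Hypothetical helper: turns the status into a human-readable description.
std::string describe_shell_status(int status) {
    const int sig = signal_from_shell_status(status);
    if (sig == 0) {
        return "exited normally with code " + std::to_string(status);
    }
    if (sig == SIGKILL) {
        return "killed by SIGKILL (signal 9)";
    }
    return "terminated by signal " + std::to_string(sig);
}
```

Calling describe_shell_status(137) reports a SIGKILL, which is consistent with the process being killed from outside rather than crashing on its own.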

Reading further, I found that the system may kill a process that exceeds an allowed execution time. That is not unlikely here: my Linux server is provided by my university, which may enforce limits of this kind. I also read about something called timeout in Linux. My first question is: how can I tell whether this is the cause of the problem?

My second question is: what else should I check to understand this problem? What would you do? Please advise.

If you require any additional information, please ask.

Thanks.

The Quantum Physicist
  • The exit status is *eight* bits, so it can be 0 to 255 (inclusive). What 137 means could be anything. My guess is that it's either HDF5 or some other external library (except Boost, which usually throws exceptions instead) that calls `exit` on certain error conditions. – Some programmer dude Jul 27 '15 at 17:56
  • Also see http://stackoverflow.com/questions/11801783/resolving-java-result-137, it seems similar. – sashoalm Oct 21 '15 at 11:24
  • https://stackoverflow.com/questions/1041182/why-does-my-perl-script-exit-with-137 – Ciro Santilli OurBigBook.com Jun 08 '17 at 14:08
  • Possible duplicate of [Why does my Perl script exit with 137?](https://stackoverflow.com/questions/1041182/why-does-my-perl-script-exit-with-137) – kenorb Apr 03 '19 at 16:48

1 Answer


It sounds like your process exceeded a memory limit and the Linux kernel's out-of-memory (OOM) killer sent it SIGKILL. If so, the system log should contain a record of it, so check the /var/log/messages file first. That's the first thing I would do. If you don't have permission to read it, ask your sysadmin.

shargors