I have a Python 2.7 daemon process that uses the daemon module from http://www.jejik.com/files/examples/daemon.py
The process is a heavy one, with about 40 GB of RAM usage and 9 child threads. The server runs RHEL 6.3 with 192 GB of RAM and enough CPU power.
After starting, the process runs for around 3-7 hours and is then killed by something, possibly the kernel. However, I could not find any hints in dmesg or in the kernel log (which I had enabled manually); there is nothing there. When I run it in the foreground instead of as a daemon, the only message I get in the terminal is "killed".
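The "killed" message suggests the process ended on a signal rather than exiting by itself. A minimal sketch of how a wrapper parent could at least record which signal it was, assuming the process can be launched as a child; `run_and_report` and the command passed to it are hypothetical and not part of my setup:

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd as a child process, wait for it, and describe how it ended."""
    proc = subprocess.Popen(cmd)
    proc.wait()
    rc = proc.returncode
    if rc < 0:
        # On POSIX, a negative return code -N means "terminated by signal N".
        return "killed by signal %d" % -rc
    return "exited with status %d" % rc


if __name__ == "__main__":
    # Hypothetical usage: pass the real command line as arguments.
    print(run_and_report(sys.argv[1:] or [sys.executable, "-c", "pass"]))
```

This only shows which signal ended the child, not who sent it.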
I have taken the following precautions:
- resetting the OOM score in /proc//oom_score_adj so that the OOM killer does not pick the process when the system is short of resources
- increasing all rlimits (that can be increased) to the maximum
- setting a higher process priority via nice (prio -15)
The problem already existed before I applied these precautions, so they are not responsible for the killing.
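To confirm that these settings are really in effect for the running process, they can be read back from inside it. A minimal, read-only sketch, assuming Linux with /proc mounted; the function names are hypothetical and not part of my daemon:

```python
import os
import resource


def read_oom_score_adj(pid="self"):
    """Return oom_score_adj for pid as an int, or None if it cannot be read."""
    path = "/proc/%s/oom_score_adj" % pid
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (IOError, OSError, ValueError):
        return None


def snapshot_limits():
    """Return {name: (soft, hard)} for every RLIMIT_* this platform defines."""
    limits = {}
    for name in dir(resource):
        if not name.startswith("RLIMIT_"):
            continue
        try:
            limits[name] = resource.getrlimit(getattr(resource, name))
        except (ValueError, resource.error):
            # Constant is defined but not usable on this platform; skip it.
            continue
    return limits


def current_niceness():
    """os.nice(0) adds 0 to the niceness and returns the resulting value."""
    return os.nice(0)


if __name__ == "__main__":
    print("oom_score_adj: %r" % read_oom_score_adj())
    print("niceness: %d" % current_niceness())
    for name, (soft, hard) in sorted(snapshot_limits().items()):
        print("%s soft=%s hard=%s" % (name, soft, hard))
```

Logging this snapshot at startup and then periodically would show whether any of the values change during the 3-7 hours before the process dies.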
I also have a mechanism that catches all exceptions, STDERR, and STDOUT and logs everything to a rotated log file. But there was nothing interesting in it just before the process died.
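One limit of this kind of logging is that it only sees what the process itself can observe. Catchable signals such as SIGTERM or SIGHUP can be logged by a handler, but SIGKILL cannot be caught, so a SIGKILL would leave no trace in the log. A minimal sketch of signal logging, where the logger name and `install_signal_logging` are hypothetical and not my actual code:

```python
import logging
import os
import signal

log = logging.getLogger("daemon.signals")

DEFAULT_SIGNALS = (signal.SIGTERM, signal.SIGHUP, signal.SIGINT, signal.SIGUSR1)


def install_signal_logging(signums=DEFAULT_SIGNALS):
    """Log every listed signal on arrival; return the list that collects them.

    Must be called from the main thread. SIGKILL and SIGSTOP cannot be
    caught, blocked, or ignored, so they are deliberately not in the list.
    Note that installing a handler replaces the default action, so the
    process no longer terminates on these signals unless the handler does it.
    """
    received = []

    def handler(signum, frame):
        received.append(signum)
        log.warning("pid %d received signal %d", os.getpid(), signum)

    for signum in signums:
        signal.signal(signum, handler)
    return received
```

If the log stays empty with handlers like these installed, that is consistent with the process being ended by SIGKILL rather than by a catchable signal.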
Modules used within the process include, among others: oracle_cx, ibm_db, suds, wsgi_utils. All of them always write logs when errors occur.
Does anyone know how to trace the kill? Who did it, and why?
Thank you in advance