Context:
I have a Linux[1] system that manages a series of third-party daemons, with which interaction is limited to shell[2] init scripts, i.e. only {start|restart|stop|status} are available.
Problem:
A new process can be assigned the PID of a previously running process. The status of each daemon is checked only by testing for the presence of a running process with its recorded PID.
Example:
Process A runs with PID 123 and subsequently dies. Process B then initialises with PID 123, and the status command responds with an erroneous "OK". In other words, we only check for the presence of a process with that PID to validate that the daemon is running; we assume that if a process with this PID exists, it is the process in question.
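To make the failure mode concrete, here is a minimal sketch of the kind of check I mean. It is not taken from any of the actual init scripts; the function name pid_is_alive and the PID file argument are hypothetical, and it assumes a Linux /proc filesystem.

```bash
#!/usr/bin/env bash
# Naive status check (hypothetical sketch): succeeds if ANY process currently
# holds the PID recorded in the PID file, whether or not it is the daemon.
pid_is_alive() {
    local pidfile=$1 pid
    [ -r "$pidfile" ] || return 1
    pid=$(<"$pidfile")
    # Reject empty or non-numeric contents.
    [[ $pid =~ ^[0-9]+$ ]] || return 1
    # A /proc/<pid> directory exists for every live process.
    [ -d "/proc/$pid" ]
}
```

Something like `pid_is_alive /var/run/daemon-a.pid && echo OK` (path made up) reports "OK" for process B just as readily as for process A.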
Proposed solutions:
- Interrogate the process, using the PID, to ensure the command/daemon running as that PID is as expected. The problem with this solution is that both the command and the PID need to match; multiple pieces of information thus need to be maintained and kept in sync, which adds complexity to error/edge conditions.
- Correlate the creation time of the PID file with the start time of the process. If the process start time is within a certain delta of the PID file creation time, we can be fairly certain that the command/daemon running is as expected.
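The two proposals above could be sketched roughly as follows. This is only an illustration under several assumptions: all function names are hypothetical, /proc/<pid>/comm and /proc/<pid>/stat are assumed to be available, GNU stat is assumed for `stat -c %Y`, the file's modification time stands in for its creation time (which is not generally exposed), and the default delta of 5 seconds is an arbitrary placeholder rather than a recommendation.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of both proposed checks; names and defaults are illustrative.

# Print the numeric PID stored in a PID file, or fail.
read_pid() {
    local pidfile=$1 pid
    [ -r "$pidfile" ] || return 1
    pid=$(<"$pidfile")
    [[ $pid =~ ^[0-9]+$ ]] || return 1
    printf '%s\n' "$pid"
}

# Proposal 1: the process holding the PID must have the expected command name.
# Note: the kernel truncates comm to 15 characters.
pid_matches_command() {
    local pidfile=$1 expected=$2 pid actual
    pid=$(read_pid "$pidfile") || return 1
    [ -r "/proc/$pid/comm" ] || return 1
    actual=$(<"/proc/$pid/comm")
    [ "$actual" = "$expected" ]
}

# Print a process's start time in seconds since the epoch, derived from /proc.
process_start_epoch() {
    local pid=$1 stat_line key value btime hz
    local -a fields
    [ -r "/proc/$pid/stat" ] || return 1
    stat_line=$(<"/proc/$pid/stat")
    # Drop "pid (comm) " so a command name containing spaces cannot shift fields.
    read -r -a fields <<< "${stat_line##*) }"
    # btime in /proc/stat is the boot time in seconds since the epoch.
    while read -r key value _; do
        if [ "$key" = "btime" ]; then btime=$value; break; fi
    done < /proc/stat
    hz=$(getconf CLK_TCK)
    [ -n "$btime" ] && [ -n "$hz" ] || return 1
    # fields[0] is stat field 3, so field 22 (starttime, in clock ticks) is fields[19].
    echo $(( btime + ${fields[19]} / hz ))
}

# Proposal 2: the PID file's mtime must be within max_delta seconds of process start.
pid_matches_start_time() {
    local pidfile=$1 max_delta=${2:-5} pid started written delta
    pid=$(read_pid "$pidfile") || return 1
    started=$(process_start_epoch "$pid") || return 1
    written=$(stat -c %Y "$pidfile") || return 1
    delta=$(( written - started ))
    if (( delta < 0 )); then delta=$(( -delta )); fi
    (( delta <= max_delta ))
}
```

A status action might then call, for example, `pid_matches_start_time /var/run/daemon-a.pid 5` (path made up). Where procps is installed, `ps -o comm= -p PID` and `ps -o lstart= -p PID` expose similar information without parsing /proc directly.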
Is there a standard way to verify the authenticity of a process/PID file, beyond the presence of a process running with that PID? I.e. I (as the system) want to know whether you (the process) are running and whether you are who I think you are (A and not B).
Assuming we have elected to implement the second solution proposed above, what tolerance (delta) between the PID file creation time and the process start time is reasonable? Here, reasonable means an acceptable compromise between type 1 and type 2 errors (false positives and false negatives).
[1] CentOS/RHEL [2] Bash