
I am trying to implement a job queuing system, similar to TORQUE/PBS, on a cluster.

One requirement is to be able to kill all of a job's subprocesses, even after their parent has exited. This matters because if someone's job does not wait for its subprocesses to finish, deliberately or unintentionally, the subprocesses become orphans and are adopted by init, which makes them difficult to track down and kill.

I did figure out a trick to work around the problem: the CPU affinity of the subprocesses, because subprocesses inherit their parent's CPU affinity. But this is not perfect, because the CPU affinity can also be changed deliberately.

I would like to know whether there is anything else that is shared by a parent process and its offspring and is also immutable.

kaspermoerch
skipper
  • You could [use `prctl()` on Linux, to kill child processes](http://stackoverflow.com/a/19448096/4279). Or [create a new session (/process group)](http://stackoverflow.com/q/4789837/4279) – jfs Feb 14 '14 at 12:03
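The second suggestion in the comment above (a new session / process group) can be applied from the queue system's side, without touching the user's script. Below is a minimal sketch, assuming the scheduler launches each job itself from Python 3 on Linux; the helper names `run_job_in_own_session` and `kill_job` are hypothetical, not part of any existing scheduler. Because the job becomes the leader of its own process group, its descendants stay in that group even after they are orphaned, so one `os.killpg()` call reaches all of them.

```python
import os
import signal
import subprocess
import time


def run_job_in_own_session(argv, **popen_kwargs):
    """Start a job as the leader of a new session and process group.

    Hypothetical helper: start_new_session=True makes the child call
    setsid() before exec, so its process group ID equals its PID.
    """
    return subprocess.Popen(argv, start_new_session=True, **popen_kwargs)


def kill_job(proc, grace_seconds=1.0):
    """Signal every process still in the job's process group.

    Sends SIGTERM first, waits briefly, then sends SIGKILL. Orphaned
    descendants are included because they inherited the group ID.
    """
    pgid = proc.pid  # the job is the group leader, so pgid == pid
    for sig in (signal.SIGTERM, signal.SIGKILL):
        try:
            os.killpg(pgid, sig)
        except ProcessLookupError:
            break  # no process is left in the group
        time.sleep(grace_seconds)
    proc.wait()  # reap the job's top-level process
```

For example, `kill_job(run_job_in_own_session(["sh", "-c", "sleep 60 &"]))` should also terminate the backgrounded `sleep`, even though the shell that started it has already exited. Like CPU affinity, this is not immutable: a job that calls `setsid()` or `setpgid()` itself leaves the group, so treat the sketch as a way to catch careless jobs rather than hostile ones.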

1 Answer


The process table in Linux (as in nearly every other operating system) is simply a data structure in the computer's RAM. It holds information about the processes currently handled by the OS.

It includes general information about each process:

  • process id
  • process owner
  • process priority
  • environment variables for each process
  • the parent process
  • pointers to the executable machine code of a process.

Credit goes to Marcus Gründler
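Several of the fields listed above can be read for the current process through Python's `os` module. The following is only an illustrative sketch for Python 3 on a Unix-like system; the function name `describe_current_process` and the dictionary keys are made up for this example.

```python
import os


def describe_current_process():
    """Collect a few process-table fields for the calling process."""
    return {
        "pid": os.getpid(),                # process id
        "parent_pid": os.getppid(),        # the parent process
        "owner_uid": os.getuid(),          # process owner (real user id)
        # Scheduling priority ("nice" value) of the calling process.
        "priority": os.getpriority(os.PRIO_PROCESS, 0),
        "environment": dict(os.environ),   # environment variables
    }
```

A child process sees its own `pid`, and its `parent_pid` changes once it is orphaned and re-parented, which is why these fields alone do not tie an orphan back to its original job.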

None of the information available there will help you out.

But maybe you can use the fact that the parent process ID becomes 1 (init) once the parent has exited, and have the process stop at that point:

#!/usr/bin/env python

import os
import sys
from time import sleep

# os.getppid() returns the parent's PID. This assumes that an orphaned
# process is adopted by init (PID 1).
while os.getppid() != 1:
    sleep(1)

# The parent PID is now 1, so the original parent is gone: exit.
sys.exit()

Would that be a solution to your problem?

brunsgaard
  • Thanks for the answer, but I don't write the actual job script; the users of the queue system do, so I can't control how their scripts behave – skipper Feb 14 '14 at 08:58