
I submitted a batch of about 20 jobs (each one mpirun -n 16 executable) through a bash script (started as ./run.sh &) that takes a few days to complete. After submitting it, I exited the terminal. Now when I log back in and run top, I can only see the PID of the job that is currently running. How can I determine the PID of run.sh, so that I can kill run.sh and terminate the whole batch instead of killing each of the 20 jobs individually?

Thank you.

rojo
gogo

1 Answer


Immediately after starting a command in the background (with &), execute the following:

LAST_CMD_PID=$!

This variable will hold the PID of the most recent command started in the background.
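
As a minimal sketch of capturing $! in practice, the snippet below uses sleep 30 as a hypothetical stand-in for run.sh; the status variable is only there for illustration.

```shell
#!/usr/bin/env bash
# Start a stand-in job in the background (sleep is a placeholder for run.sh).
sleep 30 &
LAST_CMD_PID=$!          # must be read before another background job is started
echo "background job PID: $LAST_CMD_PID"

# kill -0 sends no signal; it only checks that the process exists.
if kill -0 "$LAST_CMD_PID" 2>/dev/null; then
    status=running
else
    status=gone
fi
echo "status: $status"

# Stop the stand-in job and reap it.
kill "$LAST_CMD_PID"
wait "$LAST_CMD_PID" 2>/dev/null || true
```

Note that $! only exists in the shell that launched the job, so it has to be stored somewhere if it is needed after that shell exits.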

A common practice is to save the PID in a file with the suffix ".pid" for later use (for more info see this post).
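
The PID-file practice could look roughly like this. The file name run_demo.$$.pid and the sleep 30 stand-in job are assumptions made for the sketch, not names from the question.

```shell
#!/usr/bin/env bash
# Hypothetical PID file location; any writable path works.
PID_FILE="${TMPDIR:-/tmp}/run_demo.$$.pid"

# Launch the stand-in job and record its PID straight away.
sleep 30 &
echo $! > "$PID_FILE"

# Later, possibly from a different login session: read the PID back.
SAVED_PID=$(cat "$PID_FILE")
echo "saved PID: $SAVED_PID"

# Stop the job and remove the stale PID file.
kill "$SAVED_PID" && rm -f "$PID_FILE"
wait "$SAVED_PID" 2>/dev/null || true
```

Because the PID lives in a file, it survives logging out, which $! does not.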

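To kill run.sh, get its PID with $! (or from the saved PID file) and pass it to kill. Keep in mind that the commands run.sh triggered were forked from it, i.e. they are its children (see this post for more info), but kill signals only the process you name: children that are still running are usually re-parented to init and carry on, so to stop the whole batch you need to signal them as well, for example through the script's process group. Conversely, for a command to stay alive after you log out, it must not be brought down by the hangup signal (SIGHUP) sent when the terminal closes. To arrange this, start it with the nohup command: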

nohup your_cmd_here &
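
A slightly fuller sketch of the nohup launch, combined with a PID file so the job can still be found after logging out. The log and PID file names are hypothetical, and sleep 30 stands in for your_cmd_here.

```shell
#!/usr/bin/env bash
# Hypothetical file names for this sketch.
LOG_FILE="${TMPDIR:-/tmp}/nohup_demo.$$.log"
PID_FILE="${TMPDIR:-/tmp}/nohup_demo.$$.pid"

# nohup makes the command ignore SIGHUP; redirecting output explicitly
# avoids the default nohup.out file in the current directory.
nohup sleep 30 > "$LOG_FILE" 2>&1 &
echo $! > "$PID_FILE"
JOB_PID=$(cat "$PID_FILE")
echo "nohup job PID: $JOB_PID"

# Later: stop the job by the recorded PID and clean up.
kill "$JOB_PID"
wait "$JOB_PID" 2>/dev/null || true
rm -f "$PID_FILE" "$LOG_FILE"
```

nohup executes the command in place, so the PID stored from $! is the PID of the command itself.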
Mike Argyriou
  • Thanks, but once I have executed some other commands and exited the session, I don't think "$!" would still hold anything useful. In that case, how do I get the PID? – gogo Sep 10 '15 at 13:27
  • If you have exited the session, then run.sh dies and therefore it doesn't have a PID! But if you put it in the background, then $! will return its PID. Keep in mind that all the commands you execute in a shell are forked from it and are therefore its children (which is why you need nohup, as I have already mentioned, if they are to outlive it). – Mike Argyriou Sep 10 '15 at 13:32
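
For the situation raised in these comments, where $! is no longer available after logging back in, one possible approach is to search for the script by its command line. The sketch below assumes pgrep and ps from procps (or a BSD equivalent) are installed. It uses a temporary script as a hypothetical stand-in for run.sh.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for run.sh: a temporary script that just waits.
SCRIPT=$(mktemp /tmp/run_demo.XXXXXX)
printf '#!/usr/bin/env bash\nsleep 30\necho finished\n' > "$SCRIPT"
bash "$SCRIPT" &
STARTED_PID=$!
sleep 1   # give the script a moment to start

# From a fresh login $! is gone, so look the script up by its command line.
FOUND_PIDS=$(pgrep -u "$(id -u)" -f "$SCRIPT")
echo "matching PIDs: $FOUND_PIDS"

# Show PID, parent PID, process group and command line for each match.
for pid in $FOUND_PIDS; do
    ps -o pid=,ppid=,pgid=,args= -p "$pid"
done

# Collect the script's children first, then signal script and children together.
CHILD_PIDS=$(pgrep -P "$STARTED_PID")
kill "$STARTED_PID" $CHILD_PIDS
wait "$STARTED_PID" 2>/dev/null || true
rm -f "$SCRIPT"
```

On a real system the pattern passed to pgrep -f would be something like run.sh; if it matches more than one process, the ps output helps to pick the right one before killing anything.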