
I have an action-queue from which I submit different jobs to different servers remotely over ssh, e.g.:

ssh s1 job1.py
ssh s2 job2.py

The problem is that job1.py and job2.py can take a long time to finish, and I do not want my action-queue to block. I wonder how I can somehow reparent my jobs.

My current solution: job1.py calls subprocess.Popen(['my_actual_job.py']). With this, the ssh does not block, yet my_actual_job.py never completes; it is somehow terminated long before it finishes its tasks. If I instead do ssh s1 "job1.py 2>1", my_actual_job.py finishes, but it blocks my action-queue.

Does anyone know how I can reparent my child processes (my_actual_job.py) so that the ssh can terminate but my jobs can finish their tasks in the background?

I saw PEP 3143 (Standard daemon process library), but is there a better, cleaner way of doing it?

I cannot change my ssh command; I need to do it inside job1.py somehow. I tried double forking, but it still doesn't work...
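For reference, a minimal sketch of the double fork as I understand it (using my_actual_job.py from above); perhaps the part I am missing is the stream redirection, since ssh apparently holds the session open while the remote stdout/stderr are still attached:

    import os

    def spawn_detached(cmd):
        # First fork: the parent (job1.py) reaps the intermediate
        # child and returns, so it can exit and let ssh terminate.
        pid = os.fork()
        if pid > 0:
            os.waitpid(pid, 0)
            return

        # New session: no controlling terminal, so the job no longer
        # dies when the ssh session goes away.
        os.setsid()

        # Second fork: the grandchild can never reacquire a terminal.
        if os.fork() > 0:
            os._exit(0)

        # Detach the standard streams; otherwise ssh waits on them.
        devnull = os.open(os.devnull, os.O_RDWR)
        for fd in (0, 1, 2):
            os.dup2(devnull, fd)

        os.execvp(cmd[0], cmd)

    spawn_detached(['my_actual_job.py'])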

Amir

4 Answers


Your problem with Popen is that job1.py terminates without waiting for my_actual_job.py to finish (assuming you never call p.wait() or similar); my_actual_job.py then barely gets started before being killed.
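For contrast, a minimal sketch of the waiting variant, which keeps the job alive but brings back exactly the blocking you want to avoid:

    import subprocess

    p = subprocess.Popen(['my_actual_job.py'])
    p.wait()  # keeps job1.py (and therefore the ssh session) alive
              # until the job finishes -- so the action-queue blocks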

Have you tried asking the ssh process to background itself, e.g. ssh -f s1 job1.py? (-f sends ssh to the background just before command execution; -n alone only detaches stdin.)

Danica
  • I have no control over ssh ... I checked: http://code.activestate.com/recipes/66012-fork-a-daemon-process-on-unix/... but that doesn't work either – Amir Jul 24 '12 at 19:53

I'd recommend Celery for problems like this. It got really good in the latest release (3.0), with support for eventlet, canvas, etc.

If Celery is too much overhead, then eventlet is a better solution than Twisted in terms of the complexity it adds to a project like this.
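For illustration, a rough Celery sketch (the broker URL, module name, and task name are just placeholders): the action-queue calls .delay() and returns immediately, while a worker on the target server does the blocking work.

    # tasks.py (placeholder names; broker URL is an assumption)
    from celery import Celery
    import subprocess

    app = Celery('tasks', broker='redis://localhost:6379/0')

    @app.task
    def run_job(script):
        # Executes inside the worker process on the server,
        # so the submitting side never blocks.
        subprocess.check_call([script])

With a worker started on each server (celery -A tasks worker), the submit side becomes run_job.delay('my_actual_job.py') instead of an ssh call.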

mSolujic
0
  1. You can just run your command on the server in the background; the way isn't very obvious, but it's easy once you know it: Getting ssh to execute a command in the background on target machine

  2. You can run a simple web server that manages jobs on each machine. With Twisted you should be able to achieve this with little code (see the sketch below).
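A rough sketch of option 2 with twisted.web (the port and script path are made up); the web server, not ssh, becomes the parent of the job, so the job survives after each request returns:

    # jobserver.py (port and script path are assumptions)
    import subprocess
    from twisted.web.server import Site
    from twisted.web.resource import Resource
    from twisted.internet import reactor

    class StartJob(Resource):
        isLeaf = True

        def render_GET(self, request):
            # The long-running job is parented to this server process,
            # which outlives any single HTTP request.
            subprocess.Popen(['my_actual_job.py'])
            return 'started\n'

    reactor.listenTCP(8080, Site(StartJob()))
    reactor.run()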

0

I ended up doing subprocess.Popen(['nohup', 'my_actual_job.py']), sending both stdout and stderr to /dev/null.
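Spelled out, assuming this lives in job1.py:

    import os
    import subprocess

    # nohup makes the job ignore the SIGHUP sent when the ssh session
    # ends; redirecting both streams lets ssh close the connection
    # instead of waiting for the job's output.
    devnull = open(os.devnull, 'wb')
    subprocess.Popen(['nohup', 'my_actual_job.py'],
                     stdout=devnull, stderr=devnull)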

Amir