I would like to submit jobs to a computer cluster via the scheduler SGE using a pipe:
$ echo -e 'date; sleep 2; date' | qsub -cwd -j y -V -q all.q -N test
(The queue might be different depending on the particular cluster.)
Running this command line in a bash terminal works for me on the cluster I have access to, which runs GNU bash version 3.2.25, GE version 6.2u5 and Linux 2.6 x86_64.
In Python 2.7.2, here are my commands (the whole script is available as a gist):
import subprocess
queue = "all.q"
jobName = "test"
cmd = "date; sleep 2; date"
echoArgs = ["echo", "-e", "'%s'" % cmd]
qsubArgs = ["qsub", "-cwd", "-j", "y", "-V", "-q", queue, "-N", jobName]
Case 1: using shell=True makes it work:
wholeCmd = " ".join(echoArgs) + " | " + " ".join(qsubArgs)
out = subprocess.Popen(wholeCmd, shell=True, stdout=subprocess.PIPE)
out = out.communicate()[0]
jobId = out.split()[2]
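The indexing in the last line relies on the confirmation message that qsub prints. Below is a minimal sketch of that parsing step, assuming the message has the form `Your job <id> ("<name>") has been submitted`. The helper name `parse_job_id` and the sample string are made up for illustration and were not captured from a real run.

```python
def parse_job_id(qsub_output):
    """Return the job ID from qsub's confirmation message.

    Assumes the message looks like:
        Your job 3873705 ("test") has been submitted
    so the ID is the third whitespace-separated token.
    """
    fields = qsub_output.split()
    # Fail loudly if the message does not have the assumed shape
    # (for example, an error message, or an array job ID).
    if len(fields) < 3 or not fields[2].isdigit():
        raise ValueError("unexpected qsub output: %r" % qsub_output)
    return fields[2]


# Hypothetical sample message, modeled on the assumed format above.
sample = 'Your job 3873705 ("test") has been submitted\n'
job_id = parse_job_id(sample)
print(job_id)
```

Checking the token before using it avoids silently treating part of an error message as a job ID.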
However, I would like to avoid shell=True for the security reasons explained in the official documentation.
Case 2: using the same code as above but with shell=False results in the following error message, and the job is not even submitted:
Traceback (most recent call last):
  File "./test.py", line 22, in <module>
    out = subprocess.Popen(cmd, shell=False, stdout=subprocess.PIPE)
  File "/share/apps/lib/python2.7/subprocess.py", line 679, in __init__
    errread, errwrite)
  File "/share/apps/lib/python2.7/subprocess.py", line 1228, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory
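The same error can be reproduced without qsub. Here is a minimal sketch, assuming a POSIX system, where Popen with shell=False treats a string argument as the path of a single program. The pipeline string is a made-up stand-in for the real one.

```python
import subprocess

# Stand-in for the real pipeline string; no qsub needed.
pipeline = "echo hello | cat"

try:
    # With shell=False, a string is taken as the path of ONE executable,
    # so Popen looks for a program literally named "echo hello | cat".
    subprocess.Popen(pipeline, shell=False, stdout=subprocess.PIPE)
    error = None
except OSError as exc:
    error = exc

print(error)
```

Since no shell is started, nothing interprets the `|` character, which is why a pipeline given as one string can only work with shell=True.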
Case 3: therefore, following the official documentation as well as this post on SO, here is one supposedly proper way to do it:
echoProc = subprocess.Popen(echoArgs, stdout=subprocess.PIPE)
out = subprocess.check_output(qsubArgs, stdin=echoProc.stdout)
echoProc.wait()
The job is submitted successfully, but the job itself then fails with the following error message:
/opt/gridengine/default/spool/compute-2-27/job_scripts/3873705: line 1: echo 3; date; sleep 2; date: command not found
This is something I don't understand.
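One way to inspect what actually reaches qsub's standard input, without touching the cluster, is to put `cat` in place of qsub and print what comes out of the pipe. This is only a sketch: using `cat` as a stand-in is an assumption, and the exact text may vary with the local echo implementation.

```python
import subprocess

cmd = "date; sleep 2; date"
echoArgs = ["echo", "-e", "'%s'" % cmd]

# "cat" is a stand-in for qsub: it copies its stdin to stdout unchanged.
echoProc = subprocess.Popen(echoArgs, stdout=subprocess.PIPE)
catProc = subprocess.Popen(["cat"], stdin=echoProc.stdout,
                           stdout=subprocess.PIPE)
echoProc.stdout.close()  # so echo gets SIGPIPE if cat exits early
piped = catProc.communicate()[0]
echoProc.wait()

# No shell is involved, so nothing strips the single quotes added in
# echoArgs: they are part of the text that reaches the reader of the pipe.
print(repr(piped))
```

If the quotes show up in the printed text, the submitted job script would consist of one quoted word, which the shell on the compute node would try to run as a single command name.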
Case 4: another supposedly proper way to do it, following this post, is:
echoProc = subprocess.Popen(echoArgs, stdout=subprocess.PIPE)
qsubProc = subprocess.Popen(qsubArgs, stdin=echoProc.stdout, stdout=subprocess.PIPE)
echoProc.stdout.close()
out = qsubProc.communicate()[0]
echoProc.wait()
Here again, the job is submitted successfully, but it then fails with the following error message:
/opt/gridengine/default/spool/compute-2-32/job_scripts/3873706: line 1: echo 4; date; sleep 2; date: command not found
Did I make mistakes in my Python code? Could the problem come from the way Python or SGE were compiled and installed?