I wrote the following bash script:
function getqsubnumber {
    # Return how many simulations ($qsubnumber) are currently running
    qsubnumber=`qstat | grep p00 | wc -l`
    return $qsubnumber
}
getqsubnumber
qs=$?
if [ $qs -le $X ]
then
    echo 'Running one more simulation'
    $cmd # submit one more job to the cluster
else
    echo 'Too many simulations running ... Sleeping for 2 min'
    sleep 120
fi
The idea is that I am submitting jobs on a cluster. If there are more than X
jobs running at the same time, I want to wait for 2 minutes.
The code works for X=50 and for X=200. For some unknown reason, it doesn't work for X=400. Any idea why? The script never waits for 2 minutes; it just keeps on submitting jobs.
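One possible explanation worth checking: a shell exit status is a single byte, so in bash the value passed to `return` is truncated to the range 0-255. If that is the cause here, `$?` can never exceed 255, so the test `[ $qs -le 400 ]` would always succeed and the script would never reach the `sleep`. The sketch below illustrates this under stated assumptions: `count_jobs` is a hypothetical stand-in for `qstat | grep p00 | wc -l` that always reports 400 jobs, and `via_return` / `via_echo` are illustrative names, not part of the original script.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for `qstat | grep p00 | wc -l`,
# so the sketch runs without a cluster. It always reports 400 jobs.
count_jobs() {
    echo 400
}

# Pattern from the script above: pass the count through the exit status.
via_return() {
    return "$(count_jobs)"   # bash keeps only the low 8 bits of this value
}

# Alternative: print the count and capture it with command substitution.
via_echo() {
    count_jobs
}

via_return
status=$?            # in bash this is 400 mod 256, not 400
count=$(via_echo)    # the full value survives

echo "via return: $status"
echo "via echo:   $count"
```

If this turns out to be the problem, having the function print the count and reading it with `qs=$(getqsubnumber)` instead of `$?` would avoid the limit, since command substitution is not restricted to one byte.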