Questions tagged [sungridengine]

Oracle Grid Engine, previously known as Sun Grid Engine (SGE), CODINE (Computing in Distributed Networked Environments) or GRD (Global Resource Director), is an open source batch-queuing system that was developed and supported by Sun Microsystems. Sun once also sold a commercial product based on SGE, known as N1 Grid Engine (N1GE).

After Oracle purchased Sun, the project was forked, and there are currently three actively maintained forks: Univa Grid Engine, Son of Grid Engine and Scalable Grid Engine/Open Grid Scheduler.

Until recently Oracle offered a version known as Oracle Grid Engine, but support has been transferred to Univa along with the copyrights, and the Oracle version is expected to be folded into Univa Grid Engine.

The Scalable Grid Engine and Son of Grid Engine versions are open source and free to use under the Sun Industry Standards Source License.

The Univa Grid Engine and Oracle Grid Engine forks are proprietary and, apart from time-limited demo versions, are only available with a support contract.

Scalable Logic offers an optional support contract for the Scalable Grid Engine version.

SGE is typically used on a computer farm or high-performance computing (HPC) cluster and is responsible for accepting, scheduling, dispatching, and managing the remote and distributed execution of large numbers of standalone, parallel or interactive user jobs. It also manages and schedules the allocation of distributed resources such as processors, memory, disk space, and software licenses.

SGE is the foundation of the Sun Grid utility computing system, made available over the Internet in the United States in 2006, later becoming available in many other countries.

332 questions
47
votes
9 answers

qstat and long job names

How can I get qstat to give me full job names? I know qstat -r gives detailed information about the task, but it's too much and the resource requirements are included. The qstat -r output is like: 131806 0.25001 tumor_foca ajalali qw …
adrin
  • 4,511
  • 3
  • 34
  • 50
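
One way to approach the question above, offered as a minimal sketch rather than a definitive answer: `qstat -xml` reports job names in full, so the names can be read from the `JB_name` elements. The sample XML below is hypothetical (including the job name) and only imitates the shape of real output.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like `qstat -xml` output; the job name is made up.
SAMPLE_XML = """<job_info>
  <queue_info/>
  <job_info>
    <job_list state="pending">
      <JB_job_number>131806</JB_job_number>
      <JAT_prio>0.25001</JAT_prio>
      <JB_name>a_hypothetical_long_job_name</JB_name>
      <JB_owner>ajalali</JB_owner>
      <state>qw</state>
    </job_list>
  </job_info>
</job_info>
"""


def full_job_names(xml_text):
    """Return (job number, full job name) pairs found in qstat XML text."""
    root = ET.fromstring(xml_text)
    return [
        (job.findtext("JB_job_number"), job.findtext("JB_name"))
        for job in root.iter("job_list")
    ]


names = full_job_names(SAMPLE_XML)
```

On a real cluster the XML text would come from running `qstat -xml`, for example through `subprocess.run(["qstat", "-xml"], capture_output=True, text=True).stdout`.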
33
votes
3 answers

how to specify error log file and output file in qsub

I have a qsub script as

    #####----submit_job.sh---#####
    #!/bin/sh
    #$ -N job1
    #$ -t 1-100
    #$ -cwd
    SEEDFILE=/home/user1/data1
    SEED=$(sed -n -e "$SGE_TASK_ID p" $SEEDFILE)
    /home/user1/run.sh $SEED

The problem is-- it puts…
d.putto
  • 7,185
  • 11
  • 39
  • 45
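
As a sketch for the question above: `qsub` accepts `-o` and `-e` for the output and error paths, and those paths may contain pseudo variables such as `$JOB_NAME`, `$JOB_ID` and `$TASK_ID` that Grid Engine itself expands. The helper name `qsub_command` and the `logs` directory are assumptions made for illustration.

```python
def qsub_command(script, name, tasks, log_dir):
    """Build a qsub argument list with separate stdout and stderr files."""
    # The $-variables stay literal here: Grid Engine expands them, not the shell.
    out_path = f"{log_dir}/$JOB_NAME.o$JOB_ID.$TASK_ID"
    err_path = f"{log_dir}/$JOB_NAME.e$JOB_ID.$TASK_ID"
    return ["qsub", "-N", name, "-t", tasks, "-cwd",
            "-o", out_path, "-e", err_path, script]


# Hypothetical values mirroring the script in the question.
cmd = qsub_command("submit_job.sh", "job1", "1-100", "logs")
```

Passing the list to `subprocess.run(cmd)` avoids a shell, so the pseudo variables reach `qsub` unexpanded.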
21
votes
3 answers

excluding nodes from qsub command under sge

I have more than 200 jobs I need to submit to an SGE cluster. I'll be submitting them into two queues. One of the queues has a machine that I don't want to submit jobs to. How can I exclude that machine? The only thing I found that might be helpful is…
Yotam
  • 10,295
  • 30
  • 88
  • 128
21
votes
4 answers

Empty core dump file after Segmentation fault

I am running a program, and it is interrupted by Segmentation fault. The problem is that the core dump file is created, but of size zero. Have you heard about such a case and how to resolve it? I have enough space on the disk. I have already…
Ali
  • 9,440
  • 12
  • 62
  • 92
17
votes
1 answer

Sun Grid Engine finished job info

Is there a way to list the node which executed a Sun Grid Engine job using qstat or other SGE commands? I have to get this information using a python script. I have figured out how to execute SGE commands from python but I didn't find the solution…
sc3w
  • 1,154
  • 9
  • 21
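
For finished jobs, `qacct -j <job id>` prints accounting records as key/value lines, including `hostname` and `exit_status`. The parser below is a minimal sketch under that assumption; the sample text and host name are hypothetical and shortened.

```python
# Hypothetical, shortened sample shaped like `qacct -j 12345` output.
SAMPLE_QACCT = """==============================================================
qname        all.q
hostname     node01.example.org
owner        user1
jobname      job1
jobnumber    12345
exit_status  0
"""


def parse_qacct(text):
    """Parse qacct key/value lines into one dict per accounting record."""
    records, current = [], {}
    for line in text.splitlines():
        if line.startswith("====="):
            # A separator line starts a new record.
            if current:
                records.append(current)
            current = {}
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            current[parts[0]] = parts[1].strip()
    if current:
        records.append(current)
    return records


records = parse_qacct(SAMPLE_QACCT)
```

Here `records[0]["hostname"]` gives the node that ran the job; array jobs produce one record per task.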
14
votes
2 answers

Requesting nodes by numbers and their names in SGE

How can I request a number of nodes (not procs) at job submission in SGE? E.g., in TORQUE, we can specify qsub -l nodes=3. How can I request nodes by their names in SGE? E.g., in TORQUE, we can do this with qsub -l nodes=abc+xyz+pqr, where…
Jayavant
  • 209
  • 1
  • 2
  • 8
13
votes
1 answer

Running a binary without a top level script in SLURM

In SGE/PBS, I can submit binary executables to the cluster just like I would locally. For example: qsub -b y -cwd echo hello would submit a job named echo, which writes the word "hello" to its output file. How can I submit a similar job to SLURM.…
highBandWidth
  • 16,751
  • 20
  • 84
  • 131
12
votes
1 answer

How to list all nodes on SGE cluster?

I am trying to list all nodes on the cluster, but don't know the command. From searching, I found that qhost lists only part of the nodes. Any idea how to list all nodes?
truelies
  • 145
  • 1
  • 2
  • 9
11
votes
1 answer

QSUB: Specify output and error files for each task in Job Array

Hopefully this is not a duplicate and also not just a problem of our cluster's configuration... I am submitting a job array to a cluster using qsub with the following command: qsub -q QUEUE -N JOBNAME -t 1:10 -e ${ERRFILE}_$SGE_TASK_ID…
niak
  • 340
  • 3
  • 11
11
votes
1 answer

How to qdel range of jobs?

I want to qdel a range of jobs, with consecutive IDs. For example: qdel 18280 18281 18282 18283 18284 18285 Imagine I had a longer list of consecutive IDs like this. I obviously don't want to have to type them all by hand. Is there a simpler way?
a06e
  • 18,594
  • 33
  • 93
  • 169
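
Since `qdel` accepts several job IDs in one call, a consecutive range can be generated instead of typed. A minimal sketch, with `qdel_range` as a hypothetical helper name:

```python
def qdel_range(first, last):
    """Build a qdel argument list covering job IDs first..last inclusive."""
    return ["qdel"] + [str(job_id) for job_id in range(first, last + 1)]


cmd = qdel_range(18280, 18285)
```

The resulting list can be passed to `subprocess.run(cmd)` on a host where `qdel` is available.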
10
votes
3 answers

Variable expansion in comments

Is it possible to expand variables in comments inside a bash script? I want to write a script to feed into SGE. The qsub syntax allows me to pass additional parameters to the grid engine using lines inside the bash script which begin with #$. For…
andreas-h
  • 10,679
  • 18
  • 60
  • 78
10
votes
2 answers

what is 'Gbytes seconds'?

From the qstat (Sun Grid Engine) manpage: mem: The current accumulated memory usage of the job in Gbytes seconds. What does that mean?
Reactormonk
  • 21,472
  • 14
  • 74
  • 123
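
One reading of "Gbytes seconds", stated here as an assumption rather than taken from the manpage excerpt, is memory in use multiplied by the time it was in use, summed over the job's lifetime. The arithmetic under that reading, with made-up numbers:

```python
def gbyte_seconds(samples):
    """Sum gigabytes times seconds over (gigabytes, seconds) intervals."""
    return sum(gb * secs for gb, secs in samples)


# Hypothetical job: 2 GB held for 10 s, then 0.5 GB held for 4 s.
usage = gbyte_seconds([(2.0, 10), (0.5, 4)])
```

Under this reading the job accumulates 2 × 10 + 0.5 × 4 = 22 Gbytes seconds.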
8
votes
2 answers

Getting the exit code from a process submitted with qsub on Sun Grid Engine

I would like to submit jobs via qsub on Sun Grid Engine (now: Oracle Grid Engine?). I do not wish to use the -sync yes option or qrsh, because I want my controlling program to be single-threaded and able to launch many jobs at a time. These…
Brian
  • 690
  • 1
  • 7
  • 18
8
votes
1 answer

SGE: Jobs stuck in qw state

I'm trying to submit jobs to SGE. It has been working for me the same way in the past. Now instead, all jobs are stuck in the qw state. "qstat -g c" output: > CLUSTER QUEUE CQLOAD USED AVAIL TOTAL > all.q 0.38 0 160 1920 …
quarky
  • 333
  • 1
  • 2
  • 12
6
votes
2 answers

Why are repetitive calls to squeue in Slurm frowned upon?

Why is it not recommended to run squeue in a loop to avoid overloading Slurm, but no such limitations are mentioned for the bjobs tool from LSF or qstat from SGE? The man page for squeue states: PERFORMANCE Executing squeue sends a remote…
E. Morice
  • 63
  • 3