I am working on an online judge code checker. My code uses multi-threading in Python 2.7. On my local machine (Core i3, 4 GB RAM) the program evaluates about 1000 submissions in 1 minute 10 seconds. But when I run the same program on an EC2 micro instance (about 600 MB RAM), it takes about 40 minutes (it gets slow at random moments, for some seconds at a time). To find the reason, I broke things down.
First, this is how my evaluator works:
- I have a main program, `worker.py`, which creates multiple threads.
- The main thread pulls submissions (10 at a time) from a file (for the time being) and puts them in a global queue.
- The side threads take submissions from the queue (each submission is evaluated by exactly one thread).
- After a side thread takes a submission, it sends it to a function `compile`, which returns the executable of the submission back to that thread.
- The thread then sends this executable to a function `run`, which runs the executable (in a sandbox with defined memory and time limits), writes the executable's output to a file, and then checks it against the standard (expected) output.
- After the queue becomes empty, the main thread again pulls 10 submissions and places them in the queue.
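A minimal sketch of this structure, not the actual `worker.py`: `evaluate`, `judge`, and `NUM_WORKERS` are placeholder names, the number of threads is an assumed value, and the compile/run/check steps are replaced by a stub. The import fallback only exists so the sketch runs on both Python 2.7 and Python 3.

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2.7

BATCH_SIZE = 10    # submissions pulled per refill, as described above
NUM_WORKERS = 4    # assumed value; the real thread count is not stated


def evaluate(submission):
    # Placeholder for the compile -> run -> check steps.
    return "checked:" + submission


def worker(jobs, results, lock):
    while True:
        submission = jobs.get()
        if submission is None:      # sentinel: no more work for this thread
            jobs.task_done()
            break
        verdict = evaluate(submission)
        with lock:                  # results is a plain list shared by all threads
            results.append(verdict)
        jobs.task_done()


def judge(submissions):
    jobs = queue.Queue()
    results, lock = [], threading.Lock()
    threads = [threading.Thread(target=worker, args=(jobs, results, lock))
               for _ in range(NUM_WORKERS)]
    for t in threads:
        t.start()
    # Main thread: put one batch in the queue, wait until it has been
    # fully processed, then pull the next batch.
    for start in range(0, len(submissions), BATCH_SIZE):
        for s in submissions[start:start + BATCH_SIZE]:
            jobs.put(s)
        jobs.join()
    for _ in threads:               # one sentinel per worker thread
        jobs.put(None)
    for t in threads:
        t.join()
    return results
```

`Queue` does its own locking internally, so the sketch takes no extra lock around `get` and `put`; the explicit lock only protects the shared `results` list.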
The functions `compile` and `run`:
- The `compile` and `run` functions save the executable and the output, respectively, in files named like `<thread_Name>.exe` and `<thread_Name>.txt`, so that every thread has its own files and nothing gets overwritten.
- A thread goes to the `run` function only if the status from the `compile` function was OK (the file compiled); otherwise it reports a compile error for that submission.
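The per-thread file naming can be sketched roughly as follows. This is an illustration under assumptions, not the real code: `thread_paths`, `compile_submission`, and `run_submission` are hypothetical names, `g++` is assumed to be the compiler on `PATH`, and a plain `subprocess` call stands in for the libsandbox-based run, so no memory or time limits are enforced here.

```python
import os
import subprocess
import threading


def thread_paths(workdir):
    # One executable and one output file per thread, keyed by the thread name.
    name = threading.current_thread().name
    return (os.path.join(workdir, name + ".exe"),
            os.path.join(workdir, name + ".txt"))


def compile_submission(source_path, workdir):
    # Assumes g++ is available; returns the executable path,
    # or None when the compiler exits with a non-zero status.
    exe_path, _ = thread_paths(workdir)
    status = subprocess.call(["g++", source_path, "-o", exe_path])
    return exe_path if status == 0 else None


def run_submission(exe_path, workdir):
    # NOT sandboxed: a plain subprocess stands in for the sandbox call.
    _, out_path = thread_paths(workdir)
    with open(out_path, "w") as out:
        subprocess.call([exe_path], stdout=out)
    return out_path
```

A worker would call `run_submission` only when `compile_submission` returned a path, and report a compile error otherwise.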
Now the doubts I have:
- Is the slow execution on EC2 due to the limited resources it has, or due to Python's multi-threading? In my scripts the threads access global variables such as the queue (I put locks on it), `test.py` (I don't put a lock on it), which the `run` function uses to check the output against the standard output character by character (vimdiff-like), `mysandbox.py` (libsandbox, the sandbox), and some other global variables. So is the slowness due to Python's GIL? If so, why does it run fast on my local machine?
- Also, for the time being I submit the same file, `test.cpp` (it adds two numbers and prints the result), 1000 times. When I deliberately introduce a compile error in this file and run my main program on EC2, it runs pretty fast. From that I deduced that compiling and running (the `compile` and `run` functions) take most of the time, not thread creation and management.
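One way to check that deduction directly is to time each stage on both machines. The helper below is only a sketch: `timed`, `stage_totals`, and `totals_lock` are hypothetical names, and it measures wall-clock time, so it shows where the time goes but not why.

```python
import threading
import time
from collections import defaultdict

stage_totals = defaultdict(float)   # stage name -> total wall-clock seconds
totals_lock = threading.Lock()


def timed(stage, func, *args):
    # Run func(*args) and add its wall-clock duration to the stage's total.
    start = time.time()
    try:
        return func(*args)
    finally:
        elapsed = time.time() - start
        with totals_lock:           # totals are shared by all worker threads
            stage_totals[stage] += elapsed
```

Inside a worker thread this would wrap the existing calls, for example `timed("compile", compile, submission)` and `timed("run", run, executable)`, and `stage_totals` can be printed once all submissions are done.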
I know it's a broad question, but any help is really appreciated (or I will have to put a bounty on it, betting all my reputation :) ).