I've read all the documentation on the subject, but it seems I can't grasp the whole concept of Python coroutines well enough to implement what I want to do.
I have a background task (which generates some random files, but that doesn't much matter), and it does this in an infinite loop (this is a watcher).
I would like to implement this background task in the most efficient way possible, and I thought that microthreads (aka coroutines) were a good way to achieve that, but I can't get it to work at all (either the background task runs or the rest of the program does, but never both at the same time!).
Could someone give me a simple example of a background task implemented using coroutines? Or am I mistaken in thinking that coroutines could be used for that purpose?
I am using Python 2.7 native coroutines.
I am well versed in concurrency, particularly with DBMSes and Ada, so I know a lot about the underlying principles, but I'm not used to the generators-as-coroutines concept, which is very new to me.
/EDIT: here is a sample of my code, which I must emphasize again is not working:
@coroutine
def someroutine():
    with open('test.txt', 'a') as f:
        f.write('A')
    while True:
        pass
    yield 0

@coroutine
def spawnCoroutine():
    result = yield someroutine()
    yield result

routine = spawnCoroutine()
print 'I am working in parallel!'

# Save 'A' in the file test.txt, but does not output 'I am working in parallel!'
Note: @coroutine is a decorator from coroutine.py provided by David Beazley
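For readers who don't have coroutine.py at hand, here is a rough sketch of what that decorator does, written from memory rather than copied from David Beazley's file: it creates the generator and immediately advances it to its first yield. The worker function and the events list are made up purely for illustration.

```python
import functools

def coroutine(func):
    """Prime a generator-based coroutine by advancing it to its first yield."""
    @functools.wraps(func)
    def start(*args, **kwargs):
        gen = func(*args, **kwargs)
        next(gen)  # runs the body up to the first yield, in the caller's own thread
        return gen
    return start

events = []  # hypothetical log, only here to show the order of execution

@coroutine
def worker():
    events.append('body started')  # executes as soon as worker() is called
    yield 0
    events.append('resumed')       # executes only if the caller resumes the generator

gen = worker()
events.append('caller continues')
print(events)
```

If that is indeed how the decorator behaves, calling a decorated function whose body loops forever before its first yield never returns to the caller, which would match the behaviour described above.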
/FINAL EDIT AND SOLUTION RECAP
OK, my question was closed because it was seemingly ambiguous, yet resolving that ambiguity is the very purpose of my question: to clarify when to use coroutines rather than threading or multiprocessing.
Luckily, a nice answer was submitted before the dreaded sanction occurred!
To emphasize the answer to the above question: no, neither Python's coroutines nor bluelet/greenlet can be used to run an independent, potentially infinite CPU-bound task, because coroutines provide no parallelism.
This is what confused me the most. Parallelism is a subset of concurrency, so it is rather confusing that the current implementation of coroutines in Python allows for concurrent tasks but not for parallel ones. This behaviour must be clearly distinguished from the Task concept of concurrent programming languages such as Ada.
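To make the "concurrent but not parallel" point concrete, here is a minimal sketch of cooperative scheduling with plain generators. The names background, foreground and run_round_robin are hypothetical, and the round-robin loop is just one simple way to drive generators, not the only one.

```python
from collections import deque

def background(log, steps):
    """Hypothetical watcher: does one small unit of work, then yields."""
    for i in range(steps):
        log.append('background %d' % i)
        yield  # hand control back to the scheduler

def foreground(log, steps):
    """Hypothetical main work, also split into small steps."""
    for i in range(steps):
        log.append('foreground %d' % i)
        yield

def run_round_robin(tasks):
    """Resume each generator in turn until all of them are exhausted."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # this task is finished, so drop it
        queue.append(task)

log = []
run_round_robin([background(log, 2), foreground(log, 2)])
print(log)
```

The two tasks interleave only because each one yields voluntarily. A task that spins in a loop without yielding would keep the single thread for itself, and the scheduler would never get a chance to resume anything else.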
Also, Python's threads are similar to coroutines in that they generally switch context when waiting for I/O, and because of the GIL they are not a good candidate for independent CPU-bound tasks either (see David Beazley's Understanding the GIL).
The solution I'm currently using is to spawn subprocesses with the multiprocessing module. Spawning background processes is heavy, but it's better than running nothing at all. It also has the advantage of allowing the computation to be distributed.
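Here is a minimal sketch of that approach, under the assumption that the background job can be expressed as a plain module-level function. generate_file and run_with_background_worker are hypothetical names, and the worker writes a fixed number of markers instead of looping forever so that the example terminates.

```python
import multiprocessing
import os
import tempfile

def generate_file(path, count):
    """Hypothetical stand-in for the watcher: append `count` markers to a file."""
    with open(path, 'a') as f:
        for _ in range(count):
            f.write('A')

def run_with_background_worker(path, count):
    """Start the worker in a child process, keep working, then wait for it."""
    worker = multiprocessing.Process(target=generate_file, args=(path, count))
    worker.daemon = True  # the worker must not keep the program alive on its own
    worker.start()
    print('I am working in parallel!')  # the parent is not blocked by the child
    worker.join()  # a truly infinite watcher would be stopped with terminate() instead
    return worker.exitcode

if __name__ == '__main__':
    target = os.path.join(tempfile.mkdtemp(), 'test.txt')
    print(run_with_background_worker(target, 3))
```

The `if __name__ == '__main__'` guard matters on platforms where child processes are started by re-importing the main module; without it the child would try to spawn workers of its own.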
As an alternative, on Google App Engine, there are the deferred module and the background_thread module which can offer interesting alternatives to multiprocessing (for example by using some of the libraries that implement the Google App Engine API like typhoonae, although I'm not sure they have yet implemented these modules).