
I have looked at the documentation for both, but am not sure which is the best choice for my application. I have looked more closely at Celery, so the example is given in those terms.

My use case is similar to this question, with each worker loading a large file remotely (one file per machine); however, I also need the workers to hold persistent objects. If a worker completes a task and returns a result, then is called again, it needs to reuse a previously created variable for the new task.

Repeating the object creation on every task call is far too wasteful. I haven't seen a Celery example that leads me to believe this is possible; I was hoping to use the worker_init signal to accomplish it.
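Roughly what I have in mind is the sketch below, using the worker_process_init signal (the per-process variant of worker_init) to build the object once per worker process and reuse it in every task. The broker URL, the file path, and the load_big_file stand-in are just placeholders:

```python
from celery import Celery
from celery.signals import worker_process_init

# Broker URL is a placeholder; use whatever broker you already have configured.
app = Celery('tasks', broker='redis://localhost:6379/0')

big_object = None  # persistent, one copy per worker process


def load_big_file(path):
    # Stand-in for the real expensive construction (remote download, parsing, ...).
    with open(path, 'rb') as f:
        return f.read()


@worker_process_init.connect
def init_worker(**kwargs):
    """Runs once in each worker process when it starts up."""
    global big_object
    big_object = load_big_file('/data/large_file.dat')  # hypothetical path


@app.task
def process(offset):
    # Every call reuses the object created at worker start-up instead of rebuilding it.
    return big_object[offset]
```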

Finally, I need a central hub to keep track of what all the workers are doing. This seems to imply a client-server architecture rather than the one Celery provides; is this correct? If so, would IPython Parallel be a good choice given these requirements?


1 Answer


I'm currently evaluating Celery vs IPython parallel as well. Regarding a central hub to keep track of what the workers are doing, have you checked out the Celery Flower project here? It provides a webpage that allows you to view the status of all tasks in the queue.

  • I had not looked at Flower in detail, though just to update, I did choose IPython Parallel for my application. I needed to be able to create an object on a remote machine and repeatedly run a task using that object, and I did not find a way to have remote object persistence in Celery. – phil0stine Jul 03 '13 at 17:20
  • As it turns out, object persistence in Celery is possible and not too complex. Here's an [example](http://stackoverflow.com/questions/17035087/celery-single-task-persistent-data/19721490?noredirect=1#19721490) – phil0stine Nov 05 '13 at 00:56
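For reference, the pattern described in the comments might look roughly like the sketch below in IPython Parallel (the current package name is ipyparallel): build the object once in each engine's namespace, then let later calls reuse it. A running controller plus engines are assumed (e.g. started with ipcluster), and the file path and query function are placeholders.

```python
import ipyparallel as ipp

rc = ipp.Client()   # assumes a controller and engines are already running
dview = rc[:]       # direct view over all engines


def build_model(path):
    # Runs on each engine; the global persists in that engine's namespace.
    global model
    with open(path, 'rb') as f:
        model = f.read()


def query(offset):
    # Later tasks reuse the object built above instead of recreating it.
    return model[offset]


dview.apply_sync(build_model, '/data/large_file.dat')  # hypothetical path
print(dview.apply_sync(query, 0))                      # one result per engine
```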