
This question is related to this and this; the difference is that I'd prefer something with finer precision and low load (a per-minute cron job isn't ideal on either count) and with minimal overhead (installing celery with rabbitmq seems like overkill).

An example use case is a personal reminders server (with reminders that can be edited over the web and sent out by e-mail or XMPP).

I'm probably looking for something more like node.js's setTimeout but for django (and though I might prefer to implement reminders in node.js anyway, it's still a possibly interesting question).

For example, it's possible to start new threads in a django app (with functions consisting of sleep() and send()); in what ways can this be bad?
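
For illustration, a minimal sketch of the thread-based approach described above, using the standard library's threading.Timer in place of a hand-written sleep() loop. The names send, schedule_reminder and outbox are hypothetical; a real send() would deliver through e-mail or XMPP.

```python
import threading


def send(message, outbox):
    # Hypothetical stand-in for e-mail/XMPP delivery: just record the message.
    outbox.append(message)


def schedule_reminder(delay_seconds, message, outbox):
    """Start a background timer that calls send() after delay_seconds."""
    timer = threading.Timer(delay_seconds, send, args=(message, outbox))
    # Daemon thread: it will not keep the server process alive on shutdown.
    timer.daemon = True
    timer.start()
    return timer
```

Calling schedule_reminder(30, "call mom", outbox) from a view would return immediately and deliver about 30 seconds later, but only if the process that started the timer is still alive at that point.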

HoverHell
  • I wonder whether you're my clone (or vice versa) who's working on exactly the same project I am. If so, we may not survive if we ever meet. The universe seems too small for us. – MK_Dev Feb 06 '12 at 22:39
  • You don't have to use RabbitMQ with Celery: http://docs.celeryproject.org/en/latest/getting-started/brokers/index.html (there is also Amazon SQS which is not on this list, but that would probably come with more latency than cron's one minute precision ;) – asksol Feb 07 '12 at 10:30
  • @MK_Dev, I'm actually not yet working on any such project, just planning to maybe-get-to-it. Though if you're doing something similar or relevant, it might be interesting to join in a bit.. – HoverHell Feb 07 '12 at 14:32
  • @asksol yes, I've noticed that; but how exactly would timed messages work in those cases (e.g. with django db as broker)? (though I don't really know how exactly they work with rabbitmq either). – HoverHell Feb 07 '12 at 14:33
  • @HoverHell scheduled/timed tasks are not implemented in the broker, they are implemented by the worker (and celerybeat). – asksol Feb 07 '12 at 16:04

1 Answer


The problems with using threads for this are the typical problems with Python threads that tend to drive people towards multi-process solutions instead. They are compounded here by the fact that your thread isn't driven by the normal request-response cycle. This is summarized nicely by Malcolm Tredinnick here:

Have to disagree. Threads are not a good solution to this problem. The issue is process management. As written, your threads will never be rejoined. Webserver processes have a lifecycle uncontrollable by you (the MaxRequestsPerChild Apache parameter and similar things in other servers) and you are messing with that by using threads.

If you need a process with a lifecycle that is not matched by the request-response path (something long running and independent of the response), a completely separate process is definitely the right model to use. Using a thread is tying it to the response lifecycle, which will have unintended side-effects.

A possible solution for you might be a long-running process that performs your tasks and gets a wake-up signal from a lightweight cron job.
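
A rough sketch of that idea, under several assumptions: Linux with Python 3 (signal.sigtimedwait is not available on every platform), SIGUSR1 chosen arbitrarily as the wake-up signal, and hypothetical send and reload_queue callables supplied by you. The queue is a heapq of (due_timestamp, message) pairs.

```python
import heapq
import signal
import time


def pop_due(queue, now):
    """Remove and return, in due order, the messages whose time has passed."""
    due = []
    while queue and queue[0][0] <= now:
        due.append(heapq.heappop(queue)[1])
    return due


def seconds_until_next(queue, now, default=60.0):
    """How long to sleep before the earliest reminder is due."""
    if not queue:
        return default
    return max(0.0, queue[0][0] - now)


def serve(queue, send, reload_queue):
    """Run forever: deliver due reminders, then sleep until the next one.

    send(message) and reload_queue(queue) are hypothetical callables.
    """
    # Block SIGUSR1 so it stays pending and can be collected by sigtimedwait.
    signal.pthread_sigmask(signal.SIG_BLOCK, [signal.SIGUSR1])
    while True:
        for message in pop_due(queue, time.time()):
            send(message)
        timeout = seconds_until_next(queue, time.time())
        # Returns early (non-None) if SIGUSR1 arrived, None on timeout.
        if signal.sigtimedwait([signal.SIGUSR1], timeout) is not None:
            reload_queue(queue)  # e.g. re-read reminders edited over the web
```

The cron job (or the web app itself, after an edit) would then only need to send SIGUSR1 to this process, for example with kill and a PID file; between signals the process sleeps until exactly the next due time rather than polling.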

Another possibility would be to build something using 0mq, which is much lighter than AMQP-style queues (at the cost of some features, of course). Tarek Ziade is working on a Mozilla project called powerhose that uses 0mq, looks super simple, and has a heartbeat capability with resolution to the second.

Van Gale
  • Here is Tarek's reasoning for building this library: http://tarekziade.wordpress.com/2012/02/06/scaling-crypto-work-in-python/ – Van Gale Feb 07 '12 at 06:18
  • Aha. Regarding that comment, I understand the problem of lifecycle management well enough (also, as far as I understand, if only django's internal web server is used that won't be too much of a problem). But what exactly goes on with temporary threads and can't it be fixed? – HoverHell Feb 07 '12 at 14:41