It seems that Celery (v4.1) can be used either with some prefetching of tasks or with CELERY_ACKS_LATE=True (discussed here).
We currently work with CELERY_ACKS_LATE=False and CELERYD_PREFETCH_MULTIPLIER=1.
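For reference, a minimal sketch of that configuration as a Celery settings module. The lowercase names are, to my understanding, the Celery 4 equivalents of the old uppercase settings named above; the module name celeryconfig.py is only an assumption for illustration.

```python
# celeryconfig.py (hypothetical module name)

# Ack the message just before the task runs rather than after it finishes
# (Celery 4 name for the old CELERY_ACKS_LATE setting).
task_acks_late = False

# Each worker process reserves at most one message at a time
# (Celery 4 name for the old CELERYD_PREFETCH_MULTIPLIER setting).
worker_prefetch_multiplier = 1
```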
In both cases, there are unacknowledged messages in Rabbit.
At times we suffer from network issues that cause Celery to lose its connection to Rabbit for a few seconds, and we get this warning: consumer: Connection to broker lost. Trying to re-establish the connection...
When this happens, the unacknowledged messages return to the Ready state, which seems to be the standard behaviour, and are consumed by another consumer.
This causes tasks to be executed multiple times, because the consumer had already started a prefetched task in the worker process but couldn't ack it to Rabbit.
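To make the sequence concrete, here is a toy simulation of it. ToyQueue is a made-up stand-in written for this illustration, not RabbitMQ's or Celery's API; it only models the Ready and unacknowledged states and the redelivered flag.

```python
from collections import deque


class ToyQueue:
    """Toy model of a broker queue (hypothetical, for illustration only)."""

    def __init__(self, bodies):
        self.ready = deque({"body": b, "redelivered": False} for b in bodies)
        self.unacked = {}  # consumer name -> messages delivered but not acked

    def deliver(self, consumer):
        # Move one message from Ready to the consumer's unacked list.
        msg = self.ready.popleft()
        self.unacked.setdefault(consumer, []).append(msg)
        return msg

    def ack(self, consumer, msg):
        self.unacked[consumer].remove(msg)

    def connection_lost(self, consumer):
        # Unacked messages go back to Ready, flagged as redelivered.
        for msg in self.unacked.pop(consumer, []):
            msg["redelivered"] = True
            self.ready.appendleft(msg)


executions = []
queue = ToyQueue(["task-1"])

# worker-a receives the message and starts the task...
msg = queue.deliver("worker-a")
executions.append(("worker-a", msg["body"]))

# ...but the connection drops before its ack reaches the broker.
queue.connection_lost("worker-a")

# The message is Ready again, so worker-b picks it up and runs it too.
msg = queue.deliver("worker-b")
executions.append(("worker-b", msg["body"]))
queue.ack("worker-b", msg)

print(executions)
print(msg["redelivered"])
```

The same task body ends up in executions twice, once per worker, and the second delivery carries redelivered set to True.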
Since it seems impossible to guarantee that tasks are executed exactly once in Celery without external tools, how can we ensure that tasks are executed at most once?
----- Edit ----
One approach I'm considering is to check the task's self.request.delivery_info['redelivered'] and fail tasks that were redelivered.
While this achieves the goal of "executing at most once", it will have a high rate of false positives (redelivered tasks that had not actually been executed).
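A minimal sketch of that check, kept free of Celery imports so it can be tried on its own. The names RedeliveredTaskError and fail_if_redelivered are hypothetical helpers invented for this example; the only thing taken from Celery is the 'redelivered' key of delivery_info mentioned above.

```python
class RedeliveredTaskError(RuntimeError):
    """Hypothetical error raised when a task message was redelivered."""


def fail_if_redelivered(delivery_info):
    """Raise if the broker flagged this message as redelivered.

    delivery_info is expected to be the dict found at
    self.request.delivery_info inside a bound task; a missing dict or
    a missing key is treated as a first delivery.
    """
    if (delivery_info or {}).get("redelivered"):
        raise RedeliveredTaskError(
            "message was redelivered; refusing to run it again"
        )


# Assumed usage inside a bound Celery task (not executed here):
#
# @app.task(bind=True)
# def process(self, item_id):
#     fail_if_redelivered(self.request.delivery_info)
#     ...  # the actual work
```

Note that the check cannot tell a task that already ran from one that was only prefetched before the connection dropped, which is where the false positives come from.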