
As many places on the internet point out, if you pass around primary keys of django model instances, you have to make sure the task you're passing them to only gets started after the transaction in the current context has committed. You can do that with django's transaction.on_commit. Otherwise you'll occasionally get an ObjectDoesNotExist error in the called task.
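For a single manually-queued task, that pattern looks roughly like this (a minimal sketch; `Foo` and `process_foo` are hypothetical names standing in for my model and task):

    from django.db import transaction

    def create_and_dispatch(data):
        obj = Foo.objects.create(**data)  # hypothetical model
        # Queue the task only once the surrounding transaction commits,
        # so the worker is guaranteed to see the new row.
        transaction.on_commit(lambda: process_foo.delay(obj.pk))
        return obj.pk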

I've been developing some mildly complex celery workflows using its canvas primitives (chain, chord, group, ...), where the intermediate task calling (say, invoking the next task in a chain) is done by celery itself. I've been running into these kinds of errors, as one of the flows involves passing such a primary key to the next task in the chain, and I can't find any way of making sure the next task is called only when the transaction in the current task has finished.
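The problematic flow looks something like this (a sketch; `Foo`, the task names, and `very_long_running_query` are stand-ins for my actual code):

    from celery import chain, shared_task

    @shared_task
    def create_foo(data):
        # Any transaction opened here may still be in flight when
        # celery hands obj.pk to the next task in the chain.
        obj = Foo.objects.create(**data)  # hypothetical model
        return obj.pk

    @shared_task
    def enrich_foo(pk):
        # Can raise ObjectDoesNotExist if the commit above hasn't landed yet.
        return Foo.objects.get(pk=pk)

    def build_workflow(data):
        # celery itself calls enrich_foo with create_foo's return value;
        # there is no obvious hook to wait for create_foo's transaction.
        return chain(create_foo.s(data), enrich_foo.s())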

I've tried to turn autocommit off and handle commits and rollbacks myself like:


from django.db import transaction

transaction.set_autocommit(False)
try:
    stuff_to_get = Model.objects.first()

    try:
        data = very_long_running_query()
    except ConnectionReset as exc:
        self.retry(exc=exc)

    obj = Foo.objects.create(**data)  # this is then referred to in a task further in the chain
except Exception:
    transaction.rollback()
    raise
else:
    transaction.commit()
    return obj.pk
finally:
    transaction.set_autocommit(True)


But this results in strange django.db.utils.ProgrammingError: set_session cannot be used inside a transaction errors from postgres on the first set_autocommit call.

I'm very aware that I'm not entirely sure what I'm doing with transactions here, so the last error might be trivial to fix, but I'm not sure what it means. Is postgres saying I'm already inside a transaction? Is this strategy even a good idea to begin with (is there any harm in very long running transactions? the query can last up to 50 seconds)? Or is there another way of guaranteeing the transaction finishes before anything else is called?

(PS: It's not that I'm using the database to pass results from one celery task to the next. I'm storing an object in one task, then running a very long query in the next one to fetch some data that has to be stored with a foreign key to that object.)

thepandaatemyface
  • What kind of database? Is there any reason we're not just using a `with transaction.autocommit` block? – 2ps Dec 27 '19 at 17:41
  • Postgresql. There doesn't seem to be a `transaction.autocommit`? I'm on django 2.2 – thepandaatemyface Dec 27 '19 at 18:53
  • My bad, meant [`transaction.atomic`](https://docs.djangoproject.com/en/2.2/topics/db/transactions/#django.db.transaction.atomic) – 2ps Dec 27 '19 at 21:04
  • You can read the section there on using it as a context manager. That should allow you to close off the transaction before kicking-off the next celery task. – 2ps Dec 27 '19 at 21:05
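What the last comment suggests would look roughly like this (a sketch, with a hypothetical `Foo` model and `next_task`): close the transaction explicitly by using `transaction.atomic` as a context manager, and only enqueue the next task after the block exits:

    from django.db import transaction

    def store_and_continue(data):
        with transaction.atomic():
            obj = Foo.objects.create(**data)  # hypothetical model
            pk = obj.pk
        # The atomic block has committed (or raised) by this point,
        # so the new row is visible to other connections.
        next_task.delay(pk)  # hypothetical next task in the flow
        return pk

One caveat: if this code itself runs inside an outer atomic block (for example with ATOMIC_REQUESTS enabled), the inner atomic() only creates a savepoint, and nothing is actually committed until the outermost block exits — an already-open transaction is also the usual reason set_autocommit fails with the set_session error above.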

0 Answers