
I have a long-running task that updates some SQLAlchemy objects. A session is opened at the start of the task, updates are made along the way, and the transaction is committed at the end. The problem is that the task runs for so long that the connection closes (times out, "goes away", whatever you want to call it) before the commit can even happen. This causes the commit to fail and the whole task to fail with it.

This seems like exactly the right way to write to a DB for short tasks or non-Celery-related things. But it is certainly a problem if the tasks take too long.

Is there some other recommended pattern? Should the Celery task not even utilize the SQLAlchemy objects and instead use some sort of static class whose data can be used to update the actual SQLAlchemy objects, only at the end of the task, maybe? That is the only possible solution I have come up with. I would like to know if there are others or if my idea has other problematic implications.

Zeke Marffy
  • Can you be more specific about "timed out" connections? I've had celery tasks using sqlalchemy quite heavily that run for over 24hrs before and have not had any such issues. I would certainly not keep a transaction open that long, though. You should be doing small commits along the way or you're probably going to have bigger issues with your database locking records and the like – Stephen Fuhry Feb 12 '20 at 02:44
  • @StephenFuhry The error is that the "MySQL server has gone away". The connection drops. This is precisely the problem, I believe, that I am leaving the transaction open too long. But I feel I have to, because I edit many DB objects in this period of time. The Celery task uses the SQLAlchemy objects. If I shouldn't leave a transaction open for long, what should I do? – Zeke Marffy Feb 13 '20 at 13:55
  • I would start by looking for ways to break apart the work into smaller chunks that don't require such a long transaction, if at all possible. Otherwise, this may be a MySQL configuration issue. Look into max_allowed_packet or other similar settings: https://stackoverflow.com/questions/7942154/mysql-error-2006-mysql-server-has-gone-away – Stephen Fuhry Feb 14 '20 at 13:42
  • Definitely not a config issue. I understand your point about breaking it up. I guess the answer is to just open transactions only when the writes need to occur. – Zeke Marffy Feb 16 '20 at 17:41

1 Answer


Open the transaction only when the writes must occur. That means the session should be closed while the actual task is running, but the loaded objects should be kept around, in their detached state, and edited by the task. When the work is done, open a new session, use merge() to attach the objects, and commit. Make sure the session is created with expire_on_commit set to False, so the objects' attributes stay loaded after the session is closed. A sketch of the pattern follows.
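
Here is a minimal sketch of what that can look like. The Record model, the table name, the connection URL, and the do_expensive_step() work function are all hypothetical stand-ins, not names from the question:

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class Record(Base):
    __tablename__ = "records"  # hypothetical table
    id = Column(Integer, primary_key=True)
    status = Column(String(32))

engine = create_engine("mysql://user:pass@localhost/db")  # placeholder URL
# expire_on_commit=False keeps attribute values populated after commit,
# so detached objects remain usable once the session is closed.
Session = sessionmaker(bind=engine, expire_on_commit=False)

def long_running_task(record_ids):
    # First short-lived session: load the objects, then release the connection.
    session = Session()
    records = session.query(Record).filter(Record.id.in_(record_ids)).all()
    session.close()  # objects are now detached, but their attributes are loaded

    # The long-running work happens with no transaction or connection open,
    # so there is nothing left to time out.
    for record in records:
        record.status = do_expensive_step(record)  # hypothetical work function

    # Second short-lived session: merge the detached objects back and commit.
    session = Session()
    for record in records:
        session.merge(record)  # copies detached state onto a persistent instance
    session.commit()
    session.close()
```

The key point is that merge() copies the state of each detached object onto an instance attached to the new session, so all the edits made while no connection was open get written in one short transaction at the very end.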

Zeke Marffy