1

I have a single table that contains a number of revisions for an entity in another table (Tasks). So each TaskRevision row has a unique autoincremented id and a reference to the Task id.

How do I find the latest revision of all tasks? I have tried:

session.query(TaskRevision.id).order_by(TaskRevision.task_id).filter(TaskRevision.id==func.max(TaskRevision.id)).all()

session.query(TaskRevision.id).group_by(TaskRevision.task_id, TaskRevision.id).having(TaskRevision.id==func.max(TaskRevision.id))
Moses Koledoye
  • 77,341
  • 8
  • 133
  • 139
timbo
  • 13,244
  • 8
  • 51
  • 71

1 Answers1

1

This is slightly annoying to do, but taken from this question:

subq = session.query(TaskRevision.task_id, func.max(TaskRevision.id).label("max_id")) \
              .group_by(TaskRevision.task_id).subquery()
session.query(Task) \
       .join(subq, Task.id == subq.c.task_id) \
       .join(TaskRevision, TaskRevision.id == subq.c.max_id) \
       .with_entities(Task, TaskRevision)

You can skip the second step if all you want are IDs.

Because it's a little annoying, I usually prefer to denormalize and keep a latest_task_revision_id column on Task.

Community
  • 1
  • 1
univerio
  • 19,548
  • 3
  • 66
  • 68
  • Thankyou! And so with the Task.latest_revision_id just do a join on that? – timbo Jun 16 '16 at 01:17
  • @timbo Yep, much simpler than joining to a subquery like in the former case. – univerio Jun 16 '16 at 01:17
  • I think adding a column of latest_revision_id also begs another question and that is how to maintain consistency of this value. After some thought, it would appear that creating a new revision and updating this column *must* be done within the same transaction (i.e. Session) so that no possibility exists of any inconsistency. – timbo Jun 16 '16 at 06:27
  • This answer actually doesn't work. On running all(), I get "invalid reference to FROM-clause entry for table taskrevisions". I have verified that the subquery is good but there appears to be an issue with matching to the subq.c.max_id - which is a revision id, not a task id. – timbo Jun 16 '16 at 22:56
  • @timbo I fixed it. It appears that you are using PostgreSQL, which disallows queries of the form `SELECT * FROM a, b JOIN c WHERE c.foo = a.foo and c.foo = b.foo`. Disallowing it is technically the correct behavior according to the SQL standard, but behaviors across DB engines are inconsistent. – univerio Jun 16 '16 at 23:09
  • Almost! The line `.join(TaskRevision, TaskRevision.task_id == subq.c.max_id)` should be `.join(TaskRevision, TaskRevision.id == subq.c.max_id)`. – timbo Jun 16 '16 at 23:22