Let's say I have an Author table and a Post table, and each Author can have several Posts.
Now, with a single sqlalchemy query, I want to get all of my active Authors and the most recent published Post for each.
I've been trying to go at this by getting a list of Posts that joinedload the Author, using a subquery to group the results together, like this:
subquery = DBSession.query(Author.id, func.max(Post.publish_date).label("publish_date")) \
.join(Post.author) \
.filter(Post.state == 'published') \
.filter(Author.state == 'active') \
.group_by(Author.id) \
.subquery()
query = DBSession.query(Post) \
.options(joinedload(Post.author)) \
.join(Post.author) \
.join(subquery, and_(Author.id == subquery.c.id,
Post.publish_date == subquery.c.publish_date))
But if I have two Posts from an Author with the same publish_date, and those are the newest Posts, that means I get that Author appearing twice in my results list. And while I could use a second subquery to eliminate dupes (take func.max(Post.id)), it seems like really, really the wrong way to go about this. Is there a better way to go about this?
(Again, I'm looking for a single query, so I'm trying to avoid querying on the Author table, then looping through and doing a Post query for every Author in my results.)