I think you could use a column_property to load the latest value as an attribute of an Entities instance, alongside the other column-mapped attributes:
from sqlalchemy import select
from sqlalchemy.orm import column_property

class Entities(Base):
    ...
    value = column_property(
        select([EntityHistory.value]).
        where(EntityHistory.entity_id == id).  # the id column from before
        order_by(EntityHistory.timestamp.desc()).
        limit(1).
        correlate_except(EntityHistory)
    )
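The latest value then loads in the same SELECT as the other column-mapped attributes; a minimal usage sketch, assuming a configured session:

entity = session.query(Entities).first()
# Populated by the correlated subquery in the same SELECT;
# no additional query is emitted on access.
print(entity.value)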
A subquery could of course also be used in a query instead of a column_property:
query = session.query(
    Entities,
    session.query(EntityHistory.value).
        filter(EntityHistory.entity_id == Entities.id).
        order_by(EntityHistory.timestamp.desc()).
        limit(1).
        label('value')
)
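Each result row is then an (Entities, value) tuple rather than a plain entity instance; for example:

for entity, value in query:
    print(entity.id, value)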
Performance would naturally depend on a proper index being in place:
Index('entityhistory_entity_id_timestamp_idx',
      EntityHistory.entity_id,
      EntityHistory.timestamp.desc())
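Alternatively the index can be declared on the model itself; a sketch, assuming EntityHistory looks roughly like this (column names and types beyond entity_id and timestamp are guesses):

from sqlalchemy import Column, DateTime, ForeignKey, Index, Integer

class EntityHistory(Base):
    __tablename__ = 'entity_history'  # assumed name
    id = Column(Integer, primary_key=True)
    entity_id = Column(Integer, ForeignKey('entities.id'))  # assumed FK target
    value = Column(Integer)  # assumed type
    timestamp = Column(DateTime)
    __table_args__ = (
        Index('entityhistory_entity_id_timestamp_idx',
              entity_id, timestamp.desc()),
    )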
In a way this is still your dreaded N+1, as the query uses a subquery per row, but it's hidden in a single round trip to the DB.
If on the other hand having value as a property of Entities is not necessary, in PostgreSQL you can join with a DISTINCT ON ... ORDER BY query to fetch the latest values:
# The same index from before speeds this up.
# Remember nullslast(), if timestamp can be NULL.
values = session.query(EntityHistory.entity_id,
                       EntityHistory.value).\
    distinct(EntityHistory.entity_id).\
    order_by(EntityHistory.entity_id, EntityHistory.timestamp.desc()).\
    subquery()
query = session.query(Entities, values.c.value).\
    join(values, values.c.entity_id == Entities.id)
That said, in limited testing with dummy data the subquery-as-output-column approach always beat the join by a notable margin, as long as every entity had values. If on the other hand there were millions of entities and a lot of missing history values, then a LEFT JOIN (sketched below) was faster. I'd recommend testing on your own data to see which query suits it better. For random access to a single entity, given that the index is in place, the correlated subquery is faster. For bulk fetches: test.
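A sketch of that LEFT JOIN variant, reusing the values subquery from above; entities without any history rows are kept, with value coming back as None:

query = session.query(Entities, values.c.value).\
    outerjoin(values, values.c.entity_id == Entities.id)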