6

I am using sqlalchemy + asyncpg, and 'selectin' eager loading.

I have Person items that have one-to-many relationships with Friends.

I insert a Person into my database, with no related Friend entries. If in the same session I try and get that Person from the database, I can access their static (non-relationship) columns fine, but cannot access the friends relationship.

I think trying to access person.friends is triggering a lazy load, despite it being enforced previously as a selectin load. Why is this? How can I avoid it?

# Create the ORM model
class Person(Base):
    __tablename__ = 'items'
    id_ = Column(POSTGRES_UUID(as_uuid=True), primary_key=True)
    name = Column(String(32))
    friends = relationship('Friend', lazy='selectin')

# Create an instance
person_id = uuid4()
person = Person(id_=person_id, name='Alice') # Note that this Person's friends are not set

# Add to database
async with AsyncSession(engine, expire_on_commit=False) as session:
    try:
        session.begin()
        session.add(person)
        await session.commit()
    except:
        await session.rollback()
        raise
    # Get the added person from the database
    created_person = await session.get(person, person_id)
    print(created_person.id_) # Works fine
    print(created_person.friends) # Raises error

Error:

sqlalchemy.exc.MissingGreenlet: greenlet_spawn has not been called; can't call await_() here.
Was IO attempted in an unexpected place? (Background on this error at: https://sqlalche.me/e/14/xd2s)
jarlh
  • 42,561
  • 8
  • 45
  • 63
theo-brown
  • 653
  • 10
  • 27

2 Answers2

4

The solution is to use the populate_existing parameter in get:

populate_existing – causes the method to unconditionally emit a SQL query and refresh the object with the newly loaded data, regardless of whether or not the object is already present.

Replace

created_person = await session.get(person, person_id)

with

created_person = await session.get(person, person_id, populate_existing=True)

session.get documentation

See also: https://github.com/sqlalchemy/sqlalchemy/issues/7176

theo-brown
  • 653
  • 10
  • 27
2

@theo-brown's answers goes straight to the point, but wanted to add some interesting information here.

Adding extra context on lazy loading and async SQLAlchemy:

When you fetch data using async SqlAlchemy, every model being queried spawns a coroutine. If you don't eager load your relationships, you'll end up with partially populated models.

Imagine this use case that I'm working on: I have a batch_job object, that relates to a batch_file and batch_job entries, all of which are tables in my database. When I don't eager load them, see what happens in the debugger:

Debugger Traceback for relationships

The Traceback that I get when returning the object from an endpoint is this one:

greenlet_spawn has not been called; can't call await_only() here. Was IO attempted in an unexpected place? (Background on this error at: https://sqlalche.me/e/14/xd2s)

The reason is that I didn't await these values, and that's what eager loading does for you in async sqlalchemy.

However, you might not have to eager load if you're working inside the application scope and you'll want to use these values later, and hence you could await them.

For those who are using the ORM, you could do it with the good old loading options:

results = await db_session.execute(select(YourModel).options(joinedload(YourModel.relationshipcolumn)).all()
Tiago Duque
  • 1,956
  • 1
  • 12
  • 31