I'm starting to learn how to use SQLAlchemy and I'm running into some efficiency problems.
I created an object mapping an existing big table on our Oracle database:
engine = create_engine(connectionString, echo=False)
class POI(object):
def __repr__(self):
return "{poi_id} - {title}, {city} - {uf}".format(**self.__dict__)
def loadSession():
metadata = MetaData(engine)
_poi = Table('tbl_ourpois', metadata, autoload = True)
mapper(POI, _poi)
Session = sessionmaker(bind = engine)
session = Session()
return session
This table have millions of registries. When I do a simple query and try to iterate over it:
session = loadSession()
for poi in session.query(POI):
print poi
I noticed two things: (1) it takes some minutes for it to start printing objects on the screen, (2) memory usage starts to grow like crazy. So, my conclusion was that this code was fetching all the result set in a list and then iterating over it. Is this correct?
With cx_Oracle
, when I do a query like:
conn = cx_Oracle.connect(connectionString)
cursor = conn.cursor()
cursor.execute("select * from tbl_ourpois")
for poi in cursor:
print poi
the resulting cursor behaves as an iterator that gets results into a buffer and returns them as they are needed intead of loading the whole thing in a list. This loop starts printing results almost instantly and memory usage is pretty low and constant.
Can I get this kind of behavior wiht SQLAlchemy? Is there a way to get a constant memory iterator out of session.query(POI)
instead of a list?