-2

I have a Flask web application with a Postgres database (<10 million rows). I used SQLAlchemy to connect with Postgres, but queries that I have written in native SQLAlchemy are compiled to SQL that is too slow. The .count() method is one of the main offenders here. I am planning to rewrite my queries. Pseudocode examples:

From db.session.query(Table).filter(Table.column==condition).count()

To db.session.execute(sqlalchemy.text("SELECT count(Table.id) from Table WHERE Table.column=condition")

My question is: can I do better than the execute-text construct? Will this still get wrapped in slow SQLAlchemy logic? Or is this as close to running raw SQL as it gets? How much faster can I tune my Flask-Postgres interaction? (I'm not interested in answers that involve additional third-party services.)

Newb
  • 2,810
  • 3
  • 21
  • 35
  • every orm makes pure sql in the end as the server doesn't understand anything else. And counts are almost always slow – nbk Jun 18 '23 at 18:33
  • @nbk you misread my question. I'm asking whether me rewriting my ORM queries to pure SQL and then using sqlalchemy to execute pure SQL is good enough. I'm asking whether pure SQL, if executed through sqlalchemy, is still in some way suboptimal. – Newb Jun 18 '23 at 18:41
  • no, it will be the same outcome, i thought you wanted to speeed up the count by doing what ever which makes no sense, as the outcome is the same, you need ways to speed up count, and there are not that many possibiities, and all discussed here already – nbk Jun 18 '23 at 18:44
  • @nbk I think you're misunderstanding this. SQLAlchemy generates inefficient SQL queries. Especially for .count(), this is well-known behavior. I can manually write much more efficient SQL queries. A hand-written count query is about 5-8x faster than the query that SQLAlchemy generates. My question is about how SQLAlchemy executes my manually-written pure SQL query. – Newb Jun 18 '23 at 18:55
  • you can look at the raw sql of the query, sql alchemy doesn't make ineffecient queries, as i stated already with my first comment the query is ok the count is slow on postgres side – nbk Jun 18 '23 at 19:00
  • SQLAlchemy doesn't execute raw SQL. It simply passes it to the underlying driver which sends it to the database to be executed. SQLAlchemy is just a tool for *generating* SQL, either via the ORM or a DS: both provide a consistent interface that is (mostly) independent of the particular dialect of SQL you need to target. – chepner Jun 18 '23 at 20:37

1 Answers1

0

I think in that case you might be better off using func.count like this: db.session.query(func.count(OrmClass.id)).filter(OrmClass.column==condition).scalar().

The .query() interface has a lot of quirks because it was kept mostly backwards compatible. I didn't even know it generates a subquery to perform count. I'm sure there was/is a reason.

If you are using 1.4 with future=True or using 2+ then you can start using select() which I find way more consistent for everything.

You could write it with select() and func.count() like this, which is closer to actual SQL:

from sqlalchemy.sql import select, func

count = db.session.execute(select(func.count(OrmClass.id)).where(OrmClass.column==condition)).scalar()
Ian Wilson
  • 6,223
  • 1
  • 16
  • 24
  • This is useful information, but it doesn't answer my question. I plan to drop straight down to raw SQL and run it through `.execute(text(`. My question is whether there are any performance related pitfalls in that. – Newb Jun 18 '23 at 20:52
  • 1
    I'm not sure if you could measure it to any sort of significant digits but less layers will always be faster with the expected trade offs. For example you could even use the db driver directly and not even use sqlalchemy. Maybe write a prototype script and put `echo=True` on your engine, to make sure the queries are accurate and run some speed tests (turn echo off for the actual tests though). Then you can determine what difference is "worth" it depending on your performance issues. – Ian Wilson Jun 18 '23 at 21:24