106

I'd like to know if it's possible to generate a SELECT COUNT(*) FROM TABLE statement in SQLAlchemy without explicitly asking for it with execute(). If I use:

session.query(table).count()

then it generates something like:

SELECT count(*) AS count_1 FROM
    (SELECT table.col1 as col1, table.col2 as col2, ... from table)

which is significantly slower in MySQL with InnoDB. I am looking for a solution that doesn't require the table to have a known primary key, as suggested in Get the number of rows in table using SQLAlchemy.

Community
  • 1
  • 1
tiho
  • 6,655
  • 3
  • 31
  • 31

10 Answers10

111

Query for just a single known column:

session.query(MyTable.col1).count()
Nathan Villaescusa
  • 17,331
  • 4
  • 53
  • 56
  • 7
    This will construct the count as a wrapped select. The query optimizer of MySQL should take care of it though. – pi. Oct 17 '12 at 20:15
  • 2
    Thanks: although it doesn't exactly answer my question, it is indeed faster. I was afraid it would ignore `NULL` entries but that's not the case. However, it remains slower than an explicit `SELECT(*)`. – tiho Oct 17 '12 at 20:15
  • @Nathan Villaescusa Can you please provide an example for multiple columns? example: `session.query(MyTable.col1, MyTable.col2).count()`. sql: `SELECT col1, COUNT(col1) FROM "table"` – Давид Шико Jan 11 '21 at 12:18
109

I managed to render the following SELECT with SQLAlchemy on both layers.

SELECT count(*) AS count_1
FROM "table"

Usage from the SQL Expression layer

from sqlalchemy import select, func, Integer, Table, Column, MetaData

metadata = MetaData()

table = Table("table", metadata,
              Column('primary_key', Integer),
              Column('other_column', Integer)  # just to illustrate
             )   

print select([func.count()]).select_from(table)

Usage from the ORM layer

You just subclass Query (you have probably anyway) and provide a specialized count() method, like this one.

from sqlalchemy.sql.expression import func

class BaseQuery(Query):
    def count_star(self):
        count_query = (self.statement.with_only_columns([func.count()])
                       .order_by(None))
        return self.session.execute(count_query).scalar()

Please note that order_by(None) resets the ordering of the query, which is irrelevant to the counting.

Using this method you can have a count(*) on any ORM Query, that will honor all the filter andjoin conditions already specified.

pi.
  • 21,112
  • 8
  • 38
  • 59
  • Thanks a lot! Seems a bit convoluted, but it works :) I'll wait a bit to see if someone comes up with a simpler syntax before accepting. – tiho Oct 17 '12 at 20:18
  • 1
    Oops, sorry I was wrong, it doesn't actually work. I mean, it does what you say, but it still doesn't yield `SELECT(*)` (which is about twice as fast as your suggestion in my real use case). – tiho Oct 19 '12 at 15:43
  • I revised my example. If you really need to "select count(*)", you can achieve that via the "text" expression. – pi. Oct 22 '12 at 08:33
  • 1
    Use sqlalchemy.sql.func.count() instead of sqlalchemy.sql.text('count(*)') – ddaa Nov 26 '12 at 10:46
  • @ddaa Thanks, I removed the call to text() completely. Still gives the right answer (and more portable too). – pi. Nov 26 '12 at 12:33
  • the new import has become to be like "from sqlalchemy.sql.functions import count" for python3 – srknzl Jan 06 '21 at 13:56
  • 4
    This is likely to _drop_ the table from the query for simple `select ... from table` queries. – Martijn Pieters Apr 17 '21 at 11:34
  • As it turns out, `self.statement.froms` is not the right answer either, see https://github.com/sqlalchemy/sqlalchemy/discussions/6807, https://github.com/sqlalchemy/sqlalchemy/issues/6808 and https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/2973 for the proposed fix. – Martijn Pieters Jul 23 '21 at 20:33
31

I needed to do a count of a very complex query with many joins. I was using the joins as filters, so I only wanted to know the count of the actual objects. count() was insufficient, but I found the answer in the docs here:

http://docs.sqlalchemy.org/en/latest/orm/tutorial.html

The code would look something like this (to count user objects):

from sqlalchemy import func

session.query(func.count(User.id)).scalar() 
boatcoder
  • 17,525
  • 18
  • 114
  • 178
melchoir55
  • 6,842
  • 7
  • 60
  • 106
  • 3
    Nowhere in your example is there any query being referenced… – Piotr Dobrogost Feb 18 '19 at 16:21
  • 1
    does anyone know how the speed of this compares with `session.query(User).count()`? it's been asked here as well: https://stackoverflow.com/questions/14754994/why-is-sqlalchemy-count-much-slower-than-the-raw-query – Matt Nov 13 '22 at 19:03
18

Addition to the Usage from the ORM layer in the accepted answer: count(*) can be done for ORM using the query.with_entities(func.count()), like this:

session.query(MyModel).with_entities(func.count()).scalar()

It can also be used in more complex cases, when we have joins and filters - the important thing here is to place with_entities after joins, otherwise SQLAlchemy could raise the Don't know how to join error.

For example:

  • we have User model (id, name) and Song model (id, title, genre)
  • we have user-song data - the UserSong model (user_id, song_id, is_liked) where user_id + song_id is a primary key)

We want to get a number of user's liked rock songs:

SELECT count(*) 
  FROM user_song
  JOIN song ON user_song.song_id = song.id 
 WHERE user_song.user_id = %(user_id)
   AND user_song.is_liked IS 1
   AND song.genre = 'rock'

This query can be generated in a following way:

user_id = 1

query = session.query(UserSong)
query = query.join(Song, Song.id == UserSong.song_id)
query = query.filter(
    and_(
        UserSong.user_id == user_id, 
        UserSong.is_liked.is_(True),
        Song.genre == 'rock'
    )
)
# Note: important to place `with_entities` after the join
query = query.with_entities(func.count())
liked_count = query.scalar()

Complete example is here.

Borys Serebrov
  • 15,636
  • 2
  • 38
  • 54
  • 3
    this can only work for queries using .join(). Otherwise the query is "optimized" and the WHERE clause is simply removed. for example `session.query(Model).with_entities(func.count()).scalar()` generates `SELECT count(*) AS count_1` which I doubt anyone is searching the result of ^^. – Romain Vincent May 30 '20 at 14:11
  • @RomainVincent for me the `session.query(MyModel).filter(MyModel.rank < 100).with_entities(func.count()) ` generates `SELECT count(*) AS count_1 FROM my_table WHERE my_table.rank < %(rank_1)s`, so you may hit some edge case or a bug. – Borys Serebrov Jun 24 '20 at 10:58
  • yes I over generalized my reply. SO a better way to say it would be "This will not work if you don't explicitly use the columns in the query() arguments. I must admit I don't know the full extent of what SQLA will do or not, but from my point of view, this method seems dangerous to use since it can lead to unexpected results. – Romain Vincent Jun 25 '20 at 13:28
  • 2
    `Query.with_entities()` essentially does the same thing as `Query.statement.with_only_columns()` (as in the accepted answer), and has the same problem in that for a plain `session.query(Model)` this produces `SELECT COUNT(*)` (no `FROM`). Until https://gerrit.sqlalchemy.org/c/sqlalchemy/sqlalchemy/+/2973 lands, there is no good way to do this at the ORM level. – Martijn Pieters Apr 17 '21 at 11:48
  • I haven't checked since when is it available, but `query.with_only_columns(func.count(), maintain_column_froms=True)` seems to work. – Gabor Simon Feb 18 '23 at 18:29
5

This is not answer to the original question posted more than 10 years ago but general answer for the problem "how to make SELECT COUNT(*)". Also note that many answers here are very outdated and suitable only for old SQLAlchemy versions.

Here is solution for SQLAlchemy 1.4.x and 2.0.x:

from sqlalchemy import func, select

class MyModel(Base):
    ...

statement = select(func.count()).select_from(MyModel)
count: int = session.execute(statement).scalar()
Konstantin Smolyanin
  • 17,579
  • 12
  • 56
  • 56
1

If you are using the SQL Expression Style approach there is another way to construct the count statement if you already have your table object.

Preparations to get the table object. There are also different ways.

import sqlalchemy

database_engine = sqlalchemy.create_engine("connection string")

# Populate existing database via reflection into sqlalchemy objects
database_metadata = sqlalchemy.MetaData()
database_metadata.reflect(bind=database_engine)

table_object = database_metadata.tables.get("table_name") # This is just for illustration how to get the table_object                    

Issuing the count query on the table_object

query = table_object.count()
# This will produce something like, where id is a primary key column in "table_name" automatically selected by sqlalchemy
# 'SELECT count(table_name.id) AS tbl_row_count FROM table_name'

count_result = database_engine.scalar(query)
Ole
  • 41
  • 3
0

I'm not clear on what you mean by "without explicitly asking for it with execute()" So this might be exactly what you are not asking for. OTOH, this might help others.

You can just run the textual SQL:

your_query="""
SELECT count(*) from table
"""
the_count = session.execute(text(your_query)).scalar()
Brian C.
  • 6,455
  • 3
  • 32
  • 42
0

query = session.query(table.column).filter().with_entities(func.count(table.column.distinct())) count = query.scalar()

this worked for me.

Gives the query: SELECT count(DISTINCT table.column) AS count_1 FROM table where ...

gladiator
  • 1
  • 1
-1

Below is the way to find the count of any query.

aliased_query = alias(query)
db.session.query(func.count('*')).select_from(aliased_query).scalar()

Here is the link to the reference document if you want to explore more options or read details.

Bilal Ahmed Yaseen
  • 2,506
  • 2
  • 23
  • 48
  • 3
    This still wraps the aliased query into a subselect. An alias produces without a name is just an anonymous subselect. The SQL your answer produces is _exactly the same as what the question asker wanted to avoid_. – Martijn Pieters Jul 23 '21 at 20:37
-2
def test_query(val: str):    
   query = f"select count(*) from table where col1='{val}'"
   rtn = database_engine.query(query)
   cnt = rtn.one().count

but you can find the way if you checked debug watch