Can I run a query within a query using SQLAlchemy?

Question

Can the following MySQL query be done with a single SQLAlchemy session.query, or do I have to run a second session.query? If it can, how?

Select *, (select c from table2 where id = table1.id) as d from table1 where foo = x

daveruinseverything · Accepted Answer · 2017-07-29T05:01:37.867

What you want is SQLAlchemy's subquery object. Essentially, you write a query as normal, but instead of ending the query with .all() or .first() (as you would normally do to return some kind of result directly), you end your query with .subquery() to return a subquery object. The subquery object basically generates the subquery SQL embedded within an alias, but doesn't run it. You can then use it in your primary query, and SQLAlchemy will issue the necessary SQL to perform the query and subquery in a single operation.

Let's say we had the following student_scores table:

+------------+-------+-----+
|    name    | score | age |
+------------+-------+-----+
| Xu Feng    |   95  |  25 |
| John Smith |   88  |  26 |
| Sarah Taft |   89  |  25 |
| Ahmed Zaki |   86  |  26 |
+------------+-------+-----+

(Ignore the horrible database design)

In this example, we want to get a result set containing all the students and their scores, joined to the average score by age. In raw SQL we would do something like this:

  SELECT ss.name, ss.score, ss.age, sub.average
    FROM student_scores AS ss
    JOIN ( SELECT age, AVG(score) AS average
             FROM student_scores
         GROUP BY age) AS sub
      ON ss.age = sub.age
ORDER BY ss.score DESC

The result should be something like this:

+------------+-------+-----+---------+
|    name    | score | age | average |
+------------+-------+-----+---------+
| Xu Feng    |   95  |  25 |    92   |
| Sarah Taft |   89  |  25 |    92   |
| John Smith |   88  |  26 |    87   |
| Ahmed Zaki |   86  |  26 |    87   |
+------------+-------+-----+---------+
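If you want to check the raw SQL against the sample data without setting up MySQL, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory SQLite database and the column types are my assumptions, not part of the original setup; SQLite simply stands in for MySQL here.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student_scores (name TEXT, score INTEGER, age INTEGER)"
)
conn.executemany(
    "INSERT INTO student_scores VALUES (?, ?, ?)",
    [
        ("Xu Feng", 95, 25),
        ("John Smith", 88, 26),
        ("Sarah Taft", 89, 25),
        ("Ahmed Zaki", 86, 26),
    ],
)

# Join every student row to the per-age average computed by the subquery.
rows = conn.execute(
    """
      SELECT ss.name, ss.score, ss.age, sub.average
        FROM student_scores AS ss
        JOIN (SELECT age, AVG(score) AS average
                FROM student_scores
            GROUP BY age) AS sub
          ON ss.age = sub.age
    ORDER BY ss.score DESC
    """
).fetchall()

for row in rows:
    print(row)
```

Each printed tuple is (name, score, age, average), highest score first.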

In SQLAlchemy, we can first define the subquery on its own:

from sqlalchemy.sql import func

avg_scores = (
    session.query(
        func.avg(StudentScores.score).label('average'),
        StudentScores.age
    )
    .group_by(StudentScores.age)
    .subquery()
)

Now our subquery is defined, but no statements have actually been sent to the database. Nevertheless we can treat our subquery object almost as though it were just another table, and write our main query:

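results = (
    session.query(StudentScores, avg_scores)
    .join(avg_scores, StudentScores.age == avg_scores.c.age)
    # Order by a column expression; a raw 'score DESC' string is
    # rejected by current SQLAlchemy versions.
    .order_by(StudentScores.score.desc())
    .all()
)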

Only now is any SQL issued to the database, and we get the same results as the raw subquery example.
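To try the whole thing end to end, here is a self-contained sketch. It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database; the StudentScores mapping and its surrogate id column are hypothetical, since the original table has no primary key and the ORM needs one.

```python
from sqlalchemy import Column, Integer, String, create_engine, func
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class StudentScores(Base):
    """Hypothetical mapping for the student_scores table."""

    __tablename__ = "student_scores"

    id = Column(Integer, primary_key=True)  # surrogate key, added for the ORM
    name = Column(String)
    score = Column(Integer)
    age = Column(Integer)


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)
session = Session(engine)
session.add_all(
    [
        StudentScores(name="Xu Feng", score=95, age=25),
        StudentScores(name="John Smith", score=88, age=26),
        StudentScores(name="Sarah Taft", score=89, age=25),
        StudentScores(name="Ahmed Zaki", score=86, age=26),
    ]
)
session.commit()

# Build the subquery; nothing is sent to the database at this point.
avg_scores = (
    session.query(
        StudentScores.age,
        func.avg(StudentScores.score).label("average"),
    )
    .group_by(StudentScores.age)
    .subquery()
)

# The main query joins each student to the average for their age.
results = (
    session.query(
        StudentScores.name, StudentScores.score, avg_scores.c.average
    )
    .join(avg_scores, StudentScores.age == avg_scores.c.age)
    .order_by(StudentScores.score.desc())
    .all()
)

for name, score, average in results:
    print(name, score, average)
```

Selecting individual columns rather than the whole entity keeps each result row a plain (name, score, average) tuple, which is easier to inspect.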

Having said that, the example you provided is actually pretty trivial and shouldn't require a subquery at all. Depending on how your relationships are defined, SQLAlchemy can eagerly load related objects, so that the object returned by:

results = session.query(Table1).filter(Table1.foo == 'x').all()

will have access to the child (or parent) record(s) from Table2, even though we didn't ask for it here - because the relationship defined directly in the models is handling that for us. Check out "Relationship Loading Techniques" in the SQLAlchemy docs for more information on how this works.
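As a rough illustration of the relationship approach, here is a sketch assuming SQLAlchemy 1.4 or later and an in-memory SQLite database. The Table1 and Table2 models, the one-to-one relationship named table2, and the sample rows are all hypothetical; your real models will differ. joinedload() asks for the related row to be fetched in the same SELECT.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base, joinedload, relationship

Base = declarative_base()


class Table1(Base):
    """Hypothetical model for table1."""

    __tablename__ = "table1"

    id = Column(Integer, primary_key=True)
    foo = Column(String)
    # One-to-one link to the Table2 row that shares this id.
    table2 = relationship("Table2", uselist=False)


class Table2(Base):
    """Hypothetical model for table2, keyed on table1.id."""

    __tablename__ = "table2"

    id = Column(Integer, ForeignKey("table1.id"), primary_key=True)
    c = Column(String)


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)
session = Session(engine)
session.add_all(
    [
        Table1(id=1, foo="x", table2=Table2(c="first")),
        Table1(id=2, foo="y", table2=Table2(c="second")),
    ]
)
session.commit()

# joinedload() pulls the related Table2 row in with a JOIN in the same query.
results = (
    session.query(Table1)
    .options(joinedload(Table1.table2))
    .filter(Table1.foo == "x")
    .all()
)

for row in results:
    print(row.id, row.foo, row.table2.c)
```

Without the joinedload() option the default lazy loading would still work, but it would issue a second SELECT the first time row.table2 is accessed.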

What OP was after looks like a scalar subquery, which [`Query.label()`](http://docs.sqlalchemy.org/en/latest/orm/query.html#sqlalchemy.orm.query.Query.label) and [`Query.as_scalar()`](http://docs.sqlalchemy.org/en/latest/orm/query.html#sqlalchemy.orm.query.Query.as_scalar) handle explicitly. — Ilja Everilä, May 07 '17 at 11:37
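For reference, here is a sketch of the scalar subquery form of the query from the question, using Query.label(). It assumes SQLAlchemy 1.4 or later and an in-memory SQLite database; the Table1 and Table2 models and the sample rows are hypothetical. The explicit correlate() call is there to make the intent obvious and may not be strictly required. Note that Query.as_scalar() is named Query.scalar_subquery() in newer releases.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Table1(Base):
    """Hypothetical model for table1."""

    __tablename__ = "table1"

    id = Column(Integer, primary_key=True)
    foo = Column(String)


class Table2(Base):
    """Hypothetical model for table2."""

    __tablename__ = "table2"

    id = Column(Integer, primary_key=True)
    c = Column(String)


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)
session = Session(engine)
session.add_all(
    [
        Table1(id=1, foo="x"),
        Table1(id=2, foo="y"),
        Table1(id=3, foo="x"),
        Table2(id=1, c="first"),
        Table2(id=2, c="second"),
    ]
)
session.commit()

# (select c from table2 where id = table1.id) as d
d = (
    session.query(Table2.c)
    .filter(Table2.id == Table1.id)
    .correlate(Table1)  # refer to the outer query's table1 row
    .label("d")
)

# select *, (...) as d from table1 where foo = 'x'
results = (
    session.query(Table1, d)
    .filter(Table1.foo == "x")
    .order_by(Table1.id)
    .all()
)

for table1_row, d_value in results:
    print(table1_row.id, table1_row.foo, d_value)
```

Each result is a (Table1 object, d) pair; d is None for a table1 row with no matching table2 row, which mirrors what the scalar subquery returns in SQL.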
