0

I have a many-to-one relationship between Team and League

The following code joins the tables, and orders the leagues by goals_for for each team in that league:

for league in League.query.join(League.teams).order_by(desc(Team.goals_for)):
    total_goals = 0
    for team in league.teams:
        total_goals += team.goals_for

    print("Total goals scored in", league.full_name, "is", total_goals)

Correctly produces:

Total goals scored in Germany Bundesliga is 22

Total goals scored in English Premier League is 15

I'm wondering about two things:

  1. Given that teams is basically a list, and therefore has no total_goals as that belongs to each team instance, is there a way to sum the value of team goals without the for loop?

  2. If 1 is possible, is it faster/better than the for loop above?

zerohedge
  • 3,185
  • 4
  • 28
  • 63
  • Have you tried `sum(league.teams)`? Should be faster* since `sum` is implemented in C. *YMMV – sytech Dec 14 '16 at 22:13
  • But I need each `team`'s `goals_for` column, not the number of teams. – zerohedge Dec 14 '16 at 22:14
  • 1
    Sorry, misread what was going on. `sum(team.goals_for for team in league.teams)` would be the rough equivalent, but as a generator expression and using `sum`. – sytech Dec 14 '16 at 22:17
  • Isn't that what's called a comprehension, and is just a faster for loop? (speaking very noob-like?) – zerohedge Dec 14 '16 at 22:18
  • 1
    Comprehensions are not the same as generator expressions. See [this question](http://stackoverflow.com/questions/47789/generator-expressions-vs-list-comprehension) for some more detail on that. They are usually faster, because they do not keep all the items in memory. So if you only use the values once, a generator expression is probably optimal. Might be good to give [PEP0289](https://www.python.org/dev/peps/pep-0289/) a read. – sytech Dec 14 '16 at 22:23
  • In your example the `order_by(...)` is redundant as it has no impact on final result. – van Dec 14 '16 at 22:57

1 Answers1

3

Why not try to get aggregate result directly from the SQL query, and avoid retrieving extra data from the database and loading whole relationship tree.

Following should given you an idea:

from sqlalchemy import func

q = (
    session
    .query(League.full_name, func.sum(Team.goals_for).label("total_goals"))
    .join(Team, League.teams)
    .group_by(League.full_name)
)

for full_name, total_goals in q:
    print("Total goals scored in", full_name, "is", total_goals)
van
  • 74,297
  • 13
  • 168
  • 171
  • I upvoted and selected this because I see your name frequently in SQLAlchemy threads, but I have to admit I don't quite get how are we avoiding the whole relationship tree here. I understand that there's no join, but then how are getting the Team? Where does the "good economics" of such query come from? – zerohedge Dec 18 '16 at 23:12
  • Good economics come from the fact that this is just one single `SQL` query to the database. – van Dec 19 '16 at 10:33
  • And actually, the join is required here, so this is clearly an omission from my end. I will amend. – van Dec 19 '16 at 10:33
  • Thank you. I still don't understand **where** is the additional query in my original post. (also: no way to get in touch with you via your profile, I'd love to hire you for a project) – zerohedge Dec 19 '16 at 10:36
  • Depending on how the relationship `League.teams` is configured (is it using `joinedload`, for example), your first query will *not* load the `Team` object. Only when you access it via `league.teams` (line 3 of your code), another `SQL` query will be issued. You can check that by logging your `SQL` statements (`engine.echo = True`). But even if your relationships are loaded automatically, I would argues that the calculation of one total per league is much faster on the database then it is in the python code. Plus the data transfer between DB and code is smaller. – van Dec 19 '16 at 10:44
  • *Thank you, but I really have no time for any projects. Happy to help you with directions though*. – van Dec 19 '16 at 10:45
  • I would love a few consultation sessions is what I meant. I intend to build it on my own but directions are what I need. – zerohedge Dec 19 '16 at 10:47
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/130954/discussion-between-van-and-zerohedge). – van Dec 19 '16 at 13:04