0

I would like to extract the first entry in each group using the SqlAlchemy ORM, but struggle with the construct of this:

Example: Extract the first ever ordered item of each customer:

customer | date     | item
------------------------
Joe      | 20190101 | duck
Jane     | 20190101 | rabbit
Joe      | 20190202 | table
Bob      | 20190301 | guitar
Jane     | 20190301 | guitar

expected output

customer | date     | item
------------------------
Joe      | 20190101 | duck
Jane     | 20190101 | rabbit
Bob      | 20190301 | guitar

The problem is solved easily enough for pandas

df.sort_values(['date'],ascending=True).groupby('customer').first().reset_index()
# -- or --
df.groupby(["customer"]).apply(lambda x: x.sort_values(["date"], ascending = True)).first().reset_index()

but i cannot find out how to achieve similar for SqlAlchemy (or for SQL for what it matters, as I struggle to understand the syntax and use of these Partition-Over functions).

Thanks for any help

Ilja Everilä
  • 50,538
  • 7
  • 126
  • 127
mzoll
  • 475
  • 3
  • 11
  • 1
    Which database? – Ilja Everilä Nov 08 '19 at 16:21
  • Does it really matter when using sqlalchemy? But, it's postgres-10. – mzoll Nov 11 '19 at 13:59
  • OKay, at least for the SQL part i found out the wanted query: '''SELECT customer, date, first_value(item) OVER (PARTITION BY customer ORDER BY date ASC as first_item FROM table0;''', so now how to put this into sqlalchemy code – mzoll Nov 11 '19 at 14:37
  • It does; for example with Postgresql you can use the `DISTINCT ON ... ORDER BY` idiom: https://stackoverflow.com/questions/44069023/sqlalchemy-show-only-latest-result-if-a-join-returns-multiple-results, and if you want to use window functions, see for example https://stackoverflow.com/questions/40635099/convert-rank-and-partition-query-to-sqlalchemy – Ilja Everilä Nov 11 '19 at 18:35

0 Answers0