I have data that looks like the following:
org_id | org_name | person_id | date |
---|---|---|---|
111 | Microsoft | 453241 | 1/1/05 |
222 | Zebra | 21341 | 6/1/95 |
333 | Company | 42343241 | 1/1/23 |
111 | Microsoft | 098678 | 2/1/13 |
111 | Microsoft Inc | 6786 | 6/1/23 |
222 | Zebra | 546 | 4/1/06 |
333 | Company | vcxv313 | 2/1/23 |
222 | NewZebra | 876 | 4/1/23 |
333 | Company | 432gf | 4/1/23 |
I want to do the pandas equivalent of this SQL query, which keeps the most recent row per org_id:
SELECT org_id, org_name
FROM (
    SELECT ROW_NUMBER() OVER (PARTITION BY org_id ORDER BY date DESC) AS row_num,
           org_id, org_name
    FROM dataframe
)
WHERE row_num = 1
The result set should be:
org_id | org_name |
---|---|
111 | Microsoft Inc |
222 | NewZebra |
333 | Company |
I'm having trouble with the pandas groupby syntax and aggregate functions. Any help would be appreciated.
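One possible approach, as a minimal sketch rather than a definitive answer. It rebuilds the sample data above and assumes the `date` strings are month/day/two-digit-year (so `6/1/95` is read as 1995), that `person_id` is stored as text, and that no two rows within an `org_id` share the same latest date. The names `ranked`, `latest`, and `latest_alt` are only illustrative.

```python
import pandas as pd

# Sample data from the question; person_id kept as strings (assumption).
df = pd.DataFrame(
    {
        "org_id": [111, 222, 333, 111, 111, 222, 333, 222, 333],
        "org_name": [
            "Microsoft", "Zebra", "Company", "Microsoft", "Microsoft Inc",
            "Zebra", "Company", "NewZebra", "Company",
        ],
        "person_id": [
            "453241", "21341", "42343241", "098678", "6786",
            "546", "vcxv313", "876", "432gf",
        ],
        "date": [
            "1/1/05", "6/1/95", "1/1/23", "2/1/13", "6/1/23",
            "4/1/06", "2/1/23", "4/1/23", "4/1/23",
        ],
    }
)

# Assumption: month/day/two-digit-year. Parsing to real datetimes matters,
# because sorting the raw strings would not give chronological order.
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%y")

# Mirror ROW_NUMBER() OVER (PARTITION BY org_id ORDER BY date DESC):
# sort newest first, then number the rows within each org_id starting at 1.
ranked = df.sort_values("date", ascending=False)
ranked["row_num"] = ranked.groupby("org_id").cumcount() + 1

# WHERE row_num = 1, then SELECT org_id, org_name.
latest = (
    ranked.loc[ranked["row_num"] == 1, ["org_id", "org_name"]]
    .sort_values("org_id")
    .reset_index(drop=True)
)

# Shorter variant: look up the index label of the max date per org_id.
latest_alt = df.loc[
    df.groupby("org_id")["date"].idxmax(), ["org_id", "org_name"]
].reset_index(drop=True)

print(latest)
```

If only the first row per group is needed, `df.sort_values("date", ascending=False).drop_duplicates("org_id")` is another common idiom; the `cumcount` version is shown because it maps most directly onto `row_num` and extends to "top N per group" by changing the filter.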