Pandas dataframe: GroupBy -> Records

Question

I have a pandas dataframe filled with accounting records that I would like to groupby -> records, but I can't quite figure out which methods to use.

Here is the original dataframe:

person_id|statement_date|expense_type|amount
--------------------------------------------
1        |1/1           |coffee      |5.00
1        |1/1           |bagel       |3.00
1        |1/1           |ticket      |8.00
1        |2/1           |coffee      |5.00
1        |2/1           |donut       |4.00
1        |2/1           |ticket      |8.00
2        |1/1           |coffee      |5.00
2        |1/1           |bagel       |3.00
2        |1/1           |ticket      |8.00
2        |2/1           |coffee      |5.00
2        |2/1           |donut       |4.00
2        |2/1           |ticket      |8.00

Instead of each row containing a single expense line item, I'd like to position expenses as columns, and each record being a person_id + statment_date, like so:

person_id|statement_date|coffee|bagel|ticket|donut
--------------------------------------------------
1        |1/1           |5.00  |3.00 |8.00  |na
1        |2/1           |5.00  |na   |8.00  |3.00
2        |1/1           |5.00  |3.00 |8.00  |na
2        |2/1           |5.00  |na   |8.00  |3.00

The following groupby gets close:

df.groupby(['person_id', 'statement_date','expense_type']).sum()

But I would like the last item, 'expense_type' to become cols.

I'm sure there is some sort of apply that I could use, but I just wasn't able to figure this one out.

Thank you!

What you are looking for is a pivot table. Learn more here: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.pivot_table.html - I will try to provide a solution in the mean time. — Daniel, Feb 07 '20 at 17:48
`df.pivot_table(index=['person_id','statement_date'],columns='expense_type',values='amount')` ? — anky, Feb 07 '20 at 17:49
Converting to pivot table, and then using `.to_records()` and back into dataframe is exactly what I needed. Thank you both! — ZAR, Feb 07 '20 at 17:55

Pandas dataframe: GroupBy -> Records

0 Answers0