1

I have a csv that stores information about a particular object and Date.

Device  Date        Category    Amount
Pen     01/01/2014  A           12
Pen     01/01/2014  B           42
Pen     01/01/2014  C           10
Pen     01/01/2014  D           5
Pen     02/01/2014  A           7
Pen     02/01/2014  B           52
Pen     02/01/2014  C           1
Pen     02/01/2014  D           7
Pencil  01/01/2014  A           22
Pencil  01/01/2014  B           42
Pencil  01/01/2014  C           70
Pencil  01/01/2014  D           8

I want to read it into a DataFrame and make Category a column and Amounts for a specific device and date a row. This will make the dataset much smaller.

Device  Date        A   B   C   D
Pen    01/01/2014   12  42  10  5
Pen    02/01/2014   7   52  1   7
Pencil  01/01/2014  22  42  70  8
Bryce Ramgovind
  • 3,127
  • 10
  • 41
  • 72
  • 1
    Possible duplicate of [python, pandas, dataframe, rows to columns](http://stackoverflow.com/questions/39492251/python-pandas-dataframe-rows-to-columns) – Mohammad Athar Sep 15 '16 at 14:19
  • 1
    Possible duplicate of [Pandas transposition inside dataframe](http://stackoverflow.com/questions/24116600/pandas-transposition-inside-dataframe) – A.Kot Sep 15 '16 at 14:21

2 Answers2

5

You can use pivot_table where columns that you want to keep are set as index, columns that go to the header are set as columns and columns that fill the cells in the output data frame are set as values:

df.pivot_table(index=['Device', 'Date'], columns='Category', values='Amount').reset_index()

# Category  Device        Date     A     B   C  D
#        0     Pen  01/01/2014    12    42  10  5
#        1     Pen  02/01/2014     7    52   1  7
#        2  Pencil  01/01/2014    22    42  70  8
Psidom
  • 209,562
  • 33
  • 339
  • 356
2

using groupby and unstack

df.groupby(['Device', 'Date', 'Category'])['Amount'].sum().unstack().reset_index()

enter image description here

piRSquared
  • 285,575
  • 57
  • 475
  • 624