I have a DataFrame structured as shown below and would like to reshape it so that the values in the measure column become column headers, with the corresponding result as the value under each new column, condensing rows that share the same id and date into one row. For example, I would like to change the following table:

id name measure result date
1 A O1 X 2015
1 A O2 X 2015
1 A O3 X 2015
2 B O2 Y 2015
1 A O1 Z 2016
2 B O1 Z 2016
... ... ... ... ...

To:

id name O1 O2 O3 date
1 A X X X 2015
2 B None Y None 2015
1 A Z None None 2016
2 B Z None None 2016
... ... ... ... ... ...

I know I can use the pivot_table function in pandas; however, I am unsure how to take the different years into account. These questions are similar but do not answer mine: "How to make types in the rows of pandas dataframe to become the column header with result as row type?", "How to pivot a dataframe in Pandas?", and "How can I pivot a dataframe?".

aj3409

1 Answer

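Because result holds strings, pass aggfunc='first' to pivot_table (the default aggregation, mean, only makes sense for numeric data). Including date in the index keeps each year in its own row. rename_axis and reset_index then clean up the format: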

new_df = (
    df.pivot_table(index=['date', 'id', 'name'],
                   columns='measure',
                   values='result',
                   aggfunc='first')
        .rename_axis(columns=None)
        .reset_index()
)
# Re-order columns (move date to end)
new_df = new_df[[*new_df.columns[new_df.columns != 'date'], 'date']]

new_df:

   id name   O1   O2   O3  date
0   1    A    X    X    X  2015
1   2    B  NaN    Y  NaN  2015
2   1    A    Z  NaN  NaN  2016
3   2    B    Z  NaN  NaN  2016
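
For anyone who wants to reproduce this, here is a self-contained sketch. The df below is rebuilt by hand from the six sample rows in the question (the elided "..." rows are not guessed at), and the last step, which swaps NaN for None to match the question's desired output, is an optional extra that is not part of the answer above; with_none is just an illustrative name.

```python
import pandas as pd

# Sample rows copied from the question's table (elided rows omitted).
df = pd.DataFrame({
    'id':      [1, 1, 1, 2, 1, 2],
    'name':    ['A', 'A', 'A', 'B', 'A', 'B'],
    'measure': ['O1', 'O2', 'O3', 'O2', 'O1', 'O1'],
    'result':  ['X', 'X', 'X', 'Y', 'Z', 'Z'],
    'date':    [2015, 2015, 2015, 2015, 2016, 2016],
})

new_df = (
    df.pivot_table(index=['date', 'id', 'name'],  # one row per date/id/name
                   columns='measure',             # O1, O2, O3 become headers
                   values='result',
                   aggfunc='first')               # keep the string as-is
      .rename_axis(columns=None)                  # drop the 'measure' label
      .reset_index()
)
# Re-order columns (move date to end)
new_df = new_df[[*new_df.columns[new_df.columns != 'date'], 'date']]

# Optional (assumption: you want None rather than NaN for missing cells).
# Casting to object first lets the cells hold None.
with_none = new_df.astype(object).where(new_df.notna(), None)

print(new_df)
```

Note that aggfunc='first' silently keeps only the first result if the same date/id/name/measure combination appears more than once, so it is worth checking for duplicates if that could happen in the real data.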
Henry Ecker