2

I have this original dataframe:

                     column  data
datetime                         
2021-01-23 00:01:00       1    10
2021-01-23 00:01:00       2    15
2021-01-23 00:02:00       1    11
2021-01-23 00:03:00       1    12
2021-01-23 00:03:00       2    14
  • There may be more than one row with the same datetime, as illustrated in the example.

  • There may be more than just the two different values in column column, this is a simplified example.

  • All values are integers.

I want to create this new dataframe:

                      1    2
datetime                         
2021-01-23 00:01:00  10   15
2021-01-23 00:02:00  11  NaN
2021-01-23 00:03:00  12   14

Actions needed:

  • For every unique value in column column, create a new column with the value as the name of the column.

  • For every unique datetime, create a new row.

  • Fill the values according the data column from the original dataframe, with NaN if there is no value.

Code to create the original dataframe:

import io, pandas as pd

t = io.StringIO("""
datetime|column|data
2021-01-23 00:01:00|1|10
2021-01-23 00:01:00|2|15
2021-01-23 00:02:00|1|11
2021-01-23 00:03:00|1|12
2021-01-23 00:03:00|2|14""")
df = pd.read_csv(t, sep='|', parse_dates=['datetime']).set_index('datetime').astype(int)

Any help would be greatly appreciated. Thank you very much in advance!

Mike
  • 155
  • 2
  • 8
  • Bravo with the elegant way of asking a question. This question is answered https://stackoverflow.com/questions/47152691/how-can-i-pivot-a-dataframe and https://stackoverflow.com/questions/17298313/python-pandas-convert-rows-as-column-headers – Prayson W. Daniel Dec 07 '21 at 04:58

1 Answers1

1

pandas pivot_table will help you

pd.pivot_table(df, values ='data', index =['datetime'],columns =['column']).reset_index()

Output:

        datetime         1       2
0   2021-01-23 00:01:00 10.0    15.0
1   2021-01-23 00:02:00 11.0    NaN
2   2021-01-23 00:03:00 12.0    14.0
Subbu VidyaSekar
  • 2,503
  • 3
  • 21
  • 39