I have a DataFrame 'df1' with the following simplified format:

Hours   Material   Amount
10       Ni         .2
10       Mg         .6
10       Si         .25
20       Ni         .4
20       Mg         .9
20       Si         .6
30       Ni         .8
30       Mg         1.2
30       Si         .9

The actual data sets have more columns; however, the number of rows is determined by the number of hour samples (e.g. 3 rows per time sample).

I want to look at trends of material content over time. I tried to create an empty DataFrame 'df2' where the columns are the 'Material' values and the rows are the 'Hours' values:

      Ni    Mg    Si
10    NaN   NaN   NaN
20    NaN   NaN   NaN
30    NaN   NaN   NaN

I can't find a way to populate the DataFrame with the respective data from 'df1'. I suspect there is a better way to do this. Any suggestions?

Iceberg_Slim
  • @QuangHoang *I can't find a way to populate the DataFrame with the respective data from 'df1'.* – splash58 Oct 16 '19 at 14:30
  • Interesting! Thank you @QuangHoang! I tried `df2 = pd.DataFrame({}, index=df1.Hours.unique(), columns=df1.Material.unique())` - But the cells still contain NaN. Any ideas? – Iceberg_Slim Oct 16 '19 at 14:37
  • You need pivot - `df.pivot('Hours','Material','Amount')` – splash58 Oct 16 '19 at 14:38
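
The `pivot` suggestion in the comments can be sketched as follows. This is a minimal example that rebuilds the sample 'df1' from the question by hand; it is not necessarily how the real data is loaded. It passes `index`, `columns` and `values` as keyword arguments, since recent pandas releases no longer accept them positionally. The final column reordering is an optional extra, assuming you want the materials in their order of first appearance rather than the alphabetical order `pivot` returns.

```python
import pandas as pd

# Sample data copied from the question (the real df1 has more columns).
df1 = pd.DataFrame({
    "Hours": [10, 10, 10, 20, 20, 20, 30, 30, 30],
    "Material": ["Ni", "Mg", "Si"] * 3,
    "Amount": [0.2, 0.6, 0.25, 0.4, 0.9, 0.6, 0.8, 1.2, 0.9],
})

# Reshape long -> wide: one row per hour, one column per material.
# Keyword arguments are used because newer pandas versions require them.
df2 = df1.pivot(index="Hours", columns="Material", values="Amount")

# Optional: pivot sorts the column labels, so restore the original order.
df2 = df2[df1["Material"].unique()]

print(df2)
```

There is no need to build an empty 'df2' first. `pd.DataFrame({}, index=..., columns=...)` only sets up the labels, which is why every cell stays NaN. Note that `pivot` raises an error if the same Hours/Material pair occurs more than once; in that case `pivot_table` with an aggregation function is the usual alternative.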