
Suppose I have a data frame df:

column A  column B   column C
large     prom2      34
large     prom1      21
large     prom1      12
large     prom2      8
medium    prom2      5
medium    prom1      7
medium    prom1      12
medium    prom2      12
small     prom1      16
small     prom1      14
small     prom2      12
small     prom1      14

I want to run an analysis of variance (ANOVA), so I need to reshape my table first. The values in column A should become the index, each distinct value in column B should become a new column, and the values in column C should fill the new table. The result should look like this:

col_index  col_prom1  col_prom2
   large      21         34
   large      12         8
   medium     7          5
   medium     12         12
   small      16         12  
   small      14         NaN
   small      14         NaN
Nuri Taş
Dika

1 Answer


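You are looking for pd.DataFrame.pivot_table() (or pd.DataFrame.pivot() when each combination of 'column A' and 'column B' occurs only once). Note that pivot_table aggregates repeated combinations, using the mean by default.

A minimal working example for your use case: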

import pandas as pd

df = pd.DataFrame([['large', 'prom2', 34], ['large', 'prom1', 21]], columns=['column A', 'column B', 'column C'])
# Spread 'column B' into columns, flatten the index, and rename to the requested headers
df = df.pivot_table(index='column A', columns='column B', values='column C').reset_index().rename_axis(None, axis=1).rename(columns={'column A': 'col_index', 'prom1': 'col_prom1', 'prom2': 'col_prom2'})
df

This will yield:

    col_index   col_prom1   col_prom2
0   large       21.0        34.0
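
The two-row example has no repeated ('column A', 'column B') pairs. With the full data from the question, pivot_table would average the repeats into one row per size, which is not the layout the question asks for. One possible way to keep every observation is to number the repeats with groupby().cumcount() and pivot on that counter as well. The sketch below is only an illustration: the 'occurrence' column and the name wide are made up for this example, and passing a list as index to pivot() assumes pandas 1.1 or newer.

```python
import pandas as pd

# Full data from the question
df = pd.DataFrame({
    'column A': ['large'] * 4 + ['medium'] * 4 + ['small'] * 4,
    'column B': ['prom2', 'prom1', 'prom1', 'prom2',
                 'prom2', 'prom1', 'prom1', 'prom2',
                 'prom1', 'prom1', 'prom2', 'prom1'],
    'column C': [34, 21, 12, 8, 5, 7, 12, 12, 16, 14, 12, 14],
})

# Number each repeat of an (A, B) pair: 0 for the first, 1 for the second, ...
# 'occurrence' is a helper column invented for this sketch.
df['occurrence'] = df.groupby(['column A', 'column B']).cumcount()

wide = (
    df.pivot(index=['column A', 'occurrence'], columns='column B', values='column C')
      .add_prefix('col_')                      # prom1 -> col_prom1, prom2 -> col_prom2
      .reset_index()
      .drop(columns='occurrence')              # the counter is no longer needed
      .rename(columns={'column A': 'col_index'})
      .rename_axis(None, axis=1)               # remove the leftover 'column B' label
)
print(wide)
```

Pairs that have no partner in the other column (the extra small/prom1 rows) end up as NaN in col_prom2, so the value columns are stored as floats.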

Disclaimer: This question is a possible duplicate.

Please make sure not to create duplicates.

ABC