0

I have a dataframe:

index    col1    col2     col3
    0       X       A      123
    1       X       A      456
    2       X       B      654
    3       X       B      321
    4       X       A      111

I'm trying to use pivot_table to get this df into the following format:

index       A       B
    0     123     654  
    1     456     321 
    2     111     NaN

I'm trying to use... pivot_table(index=df.index, columns=['col2'], values='col3'), however this just ends up in the following state:

index       A       B
    0     123     NaN  
    1     456     NaN 
    2     111     NaN
    3     NaN     654 
    4     NaN     321

How can I avoid this?

mbadd
  • 937
  • 1
  • 10
  • 19

2 Answers2

1

Still using pivot_table

pd.pivot_table(df,index=df.groupby('col2').cumcount(),columns='col2',values='col3')
Out[163]: 
col2      A      B
0     123.0  654.0
1     456.0  321.0
2     111.0    NaN
BENY
  • 317,841
  • 20
  • 164
  • 234
1

Your implementation of pivot_table is correct.

To combine records (in the pivoted dataframe), which you are asking about, you need first use pivot_table and then simply drop the NaNs from the pivoted dataframe using apply, as this answer shows:

# Perform spreadsheet-style pivot table
df2 = df.pivot_table(index=df.index, columns=['col2'], values='col3')
print df2

# Drop NaNs to shift rows up and combine records
df1 = df2.apply(lambda x: pd.Series(x.dropna().values))
print (df1)

The output is:

# df2
col2      A      B
0     123.0    NaN
1     456.0    NaN
2       NaN  654.0
3       NaN  321.0
4     111.0    NaN

# df1
col2      A      B
0     123.0  654.0
1     456.0  321.0
2     111.0    NaN
edesz
  • 11,756
  • 22
  • 75
  • 123