1

I am trying to convert a dataframe with repeating rows into columns as follows

INPUT
Key  |  Value
A    |  1
B    |  2
C    |  3
A    |  4
B    |  5
C    |  6

EXPECTED OUTPUT
A  |  B  |  C
1  |  2  |  3
4  |  5  |  6

There are a lot of options like pivot(), unstack(), groupby(), etc. But, I was unsure of using it with just 2 columns as shown in the input.

Gagan G
  • 31
  • 1

1 Answers1

2

Its not a straight-forward pivot. Do this using df.pivot with df.apply and Series.dropna:

In [747]: x = df.pivot(index=None, columns='Key', values='Value').apply(lambda x: pd.Series(x.dropna().to_numpy()))

In [748]: x
Out[748]: 
Key    A    B    C
0    1.0  2.0  3.0
1    4.0  5.0  6.0

Explanation:

Let's break it down:

First you pivot your df like this:

In [751]: y = df.pivot(index=None, columns='Key', values='Value')

In [752]: y
Out[752]: 
Key    A    B    C
0    1.0  NaN  NaN
1    NaN  2.0  NaN
2    NaN  NaN  3.0
3    4.0  NaN  NaN
4    NaN  5.0  NaN
5    NaN  NaN  6.0

Now we are close to your expected output, but we need to remove Nan and collapse the 6 rows into 2 rows.

For that, we convert each column to a pd.Series and dropna():

In [753]: y.apply(lambda x: pd.Series(x.dropna().to_numpy()))
Out[753]: 
Key    A    B    C
0    1.0  2.0  3.0
1    4.0  5.0  6.0

This is your final output.

Mayank Porwal
  • 33,470
  • 8
  • 37
  • 58