My DataFrame looks like this:

df = 
index | A   | B   | C
0     |00456|text1|date1
1     |00443|text1|date2
2     |00456|text1|date3
3     |00231|text2|date4
4     |00231|text3|date1
5     |00456|text2|date1

I want to end up with the following:

df =
index | A   | B_1 |B_2  |B_3  |C_1...
0     |00456|text1|text1|text2|...
1     |00443|text1|NaN  |NaN  |...
2     |00231|text2|text3|NaN  |...

The _1 suffix and the NaN values are not important; what I want is to have all the values on the same row, grouped by the contents of column A.

What is the best way to achieve that?

Note that I have already done this using what I think is an over-complicated method, involving operations that look like this:

groups = df.groupby(["A"])
df = pd.concat((groups.A.apply(lambda x: pd.Series(data=x.values)).unstack(),
                groups.B.apply(lambda x: pd.Series(data=x.values)).unstack(),
                groups.C.apply(lambda x: pd.Series(data=x.values)).unstack()),
                keys=['A', 'B', 'C'], axis=1)
ealiaj

1 Answer

Use GroupBy.cumcount to build a per-group counter, reshape with DataFrame.set_index and DataFrame.unstack, then flatten the MultiIndex in the columns and convert the index back to a column:

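# counter of rows within each A group, starting at 1
g = df.groupby('A').cumcount() + 1
# A and the counter become the index; unstack moves the counter to the columns
df1 = df.set_index(['A', g]).unstack()
# flatten the MultiIndex columns, e.g. ('B', 1) -> 'B_1'
df1.columns = [f'{a}_{b}' for a, b in df1.columns]
df1 = df1.reset_index()
print(df1)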
     A    B_1    B_2    B_3    C_1    C_2    C_3
0  231  text2  text3    NaN  date4  date1    NaN
1  443  text1    NaN    NaN  date2    NaN    NaN
2  456  text1  text1  text2  date1  date3  date1
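
For anyone who wants to run the steps above from scratch, here is a minimal self-contained sketch. It rebuilds the sample data from the question; storing column A as strings is an assumption made here so that the leading zeros (e.g. 00456) are kept, and the names `counter` and `wide` are only illustrative.

```python
import pandas as pd

# Sample data from the question. Column A is stored as strings here
# (an assumption) so that leading zeros such as "00456" are preserved.
df = pd.DataFrame({
    "A": ["00456", "00443", "00456", "00231", "00231", "00456"],
    "B": ["text1", "text1", "text1", "text2", "text3", "text2"],
    "C": ["date1", "date2", "date3", "date4", "date1", "date1"],
})

# Number the rows within each A group: 1, 2, 3, ...
counter = df.groupby("A").cumcount() + 1

# Put A and the counter in the index, then pivot the counter into columns.
wide = df.set_index(["A", counter]).unstack()

# Flatten the (column, counter) MultiIndex into names like "B_1".
wide.columns = [f"{col}_{n}" for col, n in wide.columns]
wide = wide.reset_index()

print(wide)
```

Groups with fewer rows than the largest group are padded with NaN, which is why 00443 and 00231 have missing values in the later columns.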
jezrael
  • Given that there could be many `NaN` values within each `B_#` group, is there a way to shift them to the left? Their location does not matter as long as they are within the same group. – ealiaj Aug 06 '19 at 13:01
  • @ealiaj - I'm not sure I understand. Could you change the sample data to show your problem? – jezrael Aug 06 '19 at 13:03
  • I added it as a separate question [here](https://stackoverflow.com/questions/57376881/shift-column-values-to-the-left-with-group-constraints) – ealiaj Aug 06 '19 at 13:12