Let's say I have a data frame df:
C1 C2 C3 C4 C5
0 [A] [1] s1 [123] t1
1 [A] [1] s2 321 t2
2 [A,B] [1,2] s3 [777,111] t3
3 [B] [2] s4 145 t4
4 [B] [2] s5 [990] t5
5 [A,B,B] [1,2,2] s6 [124,125,765] t6
6 [A,A] [1,3] s7 119 t7
I want to explode everything out, so I have been doing
df = df.apply(pd.Series.explode)
However, this gives me ValueError: cannot reindex from a duplicate axis. I have traced the culprit to row 6 (the last row) of df. I understood this error when I got it before, because I had lists in C1 that were not the same length as the corresponding lists in C2. But I don't understand what's wrong with exploding that last row.
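For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the frame above and counts how many elements each cell would explode into. The cell types are my assumption from the printed output (strings in C1, ints in C2 and C4, with some C4 cells being bare ints rather than lists), and cell_len is just a helper name made up for this post.

```python
import pandas as pd

# Assumed reconstruction of the frame shown above.
df = pd.DataFrame({
    "C1": [["A"], ["A"], ["A", "B"], ["B"], ["B"], ["A", "B", "B"], ["A", "A"]],
    "C2": [[1], [1], [1, 2], [2], [2], [1, 2, 2], [1, 3]],
    "C3": ["s1", "s2", "s3", "s4", "s5", "s6", "s7"],
    "C4": [[123], 321, [777, 111], 145, [990], [124, 125, 765], 119],
    "C5": ["t1", "t2", "t3", "t4", "t5", "t6", "t7"],
})


def cell_len(value):
    """Number of rows a cell would explode into (a scalar counts as 1)."""
    return len(value) if isinstance(value, list) else 1


# Only the columns that contain lists somewhere.
list_cols = ["C1", "C2", "C4"]
lengths = pd.DataFrame({col: df[col].map(cell_len) for col in list_cols})

# Rows where the list-like columns do not all explode to the same count.
mismatched = lengths.index[lengths.nunique(axis=1) > 1].tolist()

print(lengths)
print("rows with differing counts:", mismatched)  # [6]
```

With this reconstruction, row 6 is the only row where the counts differ across C1, C2, and C4 (2, 2, and 1), because C4 holds a bare 119 there.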
If I do pd.DataFrame([[['A','A'],[1,3],'s7',119,'t7']]).apply(pd.Series.explode), it works fine and, as expected, gives me the following:
C1 C2 C3 C4 C5
0 A 1 s7 119 t7
1 A 3 s7 119 t7
I can't figure out why that last row causes an error when it is part of the whole data frame. I have checked the index and it is all unique.