1

If we have the following df,

df

    A   A   B   B   B
0  10   2   0   3   3
1  20   4  19  21  36
2  30  20  24  24  12
3  40  10  39  23  46

How can I combine the content of the columns with the same names? e.g.

    A   B  
0  10   0   
1  20  19  
2  30  24
3  40  39
4  2   3
5  4   21
6  20  24
7  10  23
8  Na  3
9  Na  36
10 Na  12
11 Na  46

I tried groupby and merge and both are not doing this job.

Any help is appreciated.

Hazem
  • 380
  • 1
  • 5
  • 14

1 Answers1

2

If columns names are duplicated you can use DataFrame.melt with concat:

df = pd.concat([df['A'].melt()['value'], df['B'].melt()['value']], axis=1, keys=['A','B'])
print (df)
       A   B
0   10.0   0
1   20.0  19
2   30.0  24
3   40.0  39
4    2.0   3
5    4.0  21
6   20.0  24
7   10.0  23
8    NaN   3
9    NaN  36
10   NaN  12
11   NaN  46

EDIT:

uniq = df.columns.unique()
df = pd.concat([df[c].melt()['value'] for c in uniq], axis=1, keys=uniq)
print (df)
       A   B
0   10.0   0
1   20.0  19
2   30.0  24
3   40.0  39
4    2.0   3
5    4.0  21
6   20.0  24
7   10.0  23
8    NaN   3
9    NaN  36
10   NaN  12
11   NaN  46
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • 1
    Thank you very much @jezrael. I've never heard of .melt. If I want to automate the process, I should replace the 'A' and 'B' by a loop over df.columns, right? – Hazem Nov 12 '20 at 10:38