Removing duplicate columns with the same column name in pandas

Question

I have the following dataframe:

   a  a  b  a  b
0  0  1  2  1  2
1  3  4  5  4  5

For each column name, I want to remove its duplicate columns, that is, columns that repeat both the name and the values of an earlier column. The resulting dataframe should be:

   a  a  b
0  0  1  2
1  3  4  5

I have achieved this by calling drop_duplicates() on the transpose of df[['column_name']] for each column name, but it's too slow.

Is there a faster way to do this?

Somewhat related: do you have to use duplicate column names? That needlessly complicates subsequent analysis. — Peter Leimbigler, Mar 04 '20 at 15:49
Check out kalu's answer here: https://stackoverflow.com/a/32961145 — Peter Leimbigler, Mar 04 '20 at 15:52
Yes. The column names are actually date strings ('2020-02-03'), and I want to remove the duplicate columns within the same date. That's why I have duplicate column names. — rgralma, Mar 05 '20 at 10:04

score 2 · Answer 1 · answered Mar 04 '20 at 15:51

IIUC

df=df.loc[:,~(df.T.duplicated()&df.columns.duplicated())]
Out[184]: 
   a  a  b
0  0  1  2
1  3  4  5

BENY

That's actually not working if a and b have the same column values – rgralma Mar 05 '20 at 10:02
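
One way to avoid the false positive raised in that comment is to compare each column's name together with its values, instead of testing names and values separately. Below is a minimal sketch, assuming the frame is small enough to transpose in memory. The helper name drop_duplicate_columns is hypothetical and not part of pandas.

```python
import pandas as pd


def drop_duplicate_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Drop columns that repeat both the name and the values of an earlier column.

    Hypothetical helper for illustration; not part of pandas.
    """
    # After the transpose each original column is a row; reset_index() turns
    # the column names into an ordinary column, so duplicated() compares
    # name and values together.
    as_rows = df.T.reset_index()
    keep = ~as_rows.duplicated().to_numpy()
    # A plain boolean array selects columns by position, which sidesteps
    # label lookups on the duplicated names.
    return df.loc[:, keep]


# The example from the question.
df = pd.DataFrame([[0, 1, 2, 1, 2], [3, 4, 5, 4, 5]], columns=list("aabab"))
result = drop_duplicate_columns(df)

# The case from the comment: the second "b" matches "a" in value but not in name.
tricky = pd.DataFrame([[1, 3, 1], [2, 4, 2]], columns=list("abb"))
tricky_result = drop_duplicate_columns(tricky)
```

On the question's example this keeps the columns a, a, b. In the second frame all three columns survive, because no column repeats both the name and the values of an earlier one. It transposes the frame once rather than once per column name, which may help with speed, though that has not been measured here.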
