
The problem is the following. I have a DataFrame:

   a  a  b  a  b
0  0  1  2  1  2
1  3  4  5  4  5

For each column name, I want to remove its duplicate columns, i.e. later columns that have the same name and the same values as an earlier one. The resulting DataFrame should be:

   a  a  b
0  0  1  2
1  3  4  5

I achieved this by calling drop_duplicates() on the transpose of df[['column_name']] for each column name, but it's too slow.

Is there a faster way to solve this?

rgralma

1 Answer


IIUC, you want to drop a column only when both its name and its values repeat an earlier column:

df = df.loc[:, ~(df.T.duplicated() & df.columns.duplicated())]
Out[184]: 
   a  a  b
0  0  1  2
1  3  4  5
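
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the frame from the question and applies the same mask step by step. The names same_values, same_name and drop_duplicate_columns are only illustrative, not part of pandas. One caveat, as far as I can tell: the two checks are independent, so a column whose name repeats one earlier column and whose values repeat a different earlier column would also be dropped. The helper at the end is one possible stricter variant that compares name and values together; it assumes that is the behaviour you want.

```python
import pandas as pd

# Rebuild the frame from the question, including the repeated column names.
df = pd.DataFrame(
    [[0, 1, 2, 1, 2], [3, 4, 5, 4, 5]],
    columns=["a", "a", "b", "a", "b"],
)

# True where a column's values repeat those of an earlier column.
same_values = df.T.duplicated().to_numpy()
# True where a column's name repeats that of an earlier column.
same_name = df.columns.duplicated()

# Keep a column unless both flags are set.
result = df.loc[:, ~(same_values & same_name)]
print(result)


def drop_duplicate_columns(frame):
    """Hypothetical helper: drop columns whose (name, values) pair repeats."""
    # reset_index turns the column names into an ordinary column, so
    # duplicated() compares the name and the values as a single key.
    keyed = frame.T.reset_index()
    return frame.loc[:, ~keyed.duplicated().to_numpy()]


strict = drop_duplicate_columns(df)
```

On the example frame both versions keep the same three columns. They only differ on inputs like columns a, b, a where the last a holds the same values as b.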
BENY