7

If you have multiple columns with the same name in a dataframe, how do you remove all of the columns except the first one?

ayhan
JungleDiff
  • You can use the `pandas` function `drop_duplicates` with parameter `keep='first'` – MMF Jun 06 '17 at 21:45

1 Answer

22

Let df be a dataframe with two duplicated columns:

import pandas as pd

df = pd.DataFrame([[1,2,3],[4,5,6],[7,8,9]], columns=("a","a","b"))
#   a  a  b
#0  1  2  3
#1  4  5  6
#2  7  8  9

`df.columns.duplicated()` flags every repeat of a column name that has already appeared. Negate that mask to keep only the first occurrence of each name:

df1 = df.loc[:, ~df.columns.duplicated()]
#   a  b
#0  1  3
#1  4  6
#2  7  9
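
To see why this works, it may help to look at the mask itself. The sketch below rebuilds the same example frame, prints the mask, and also tries the `keep` argument of `Index.duplicated`, in case you want the last occurrence of each name instead of the first. The variable names `mask`, `first` and `last` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=("a", "a", "b"))

# True marks a column whose name has already appeared further left;
# the first occurrence of each name is False.
mask = df.columns.duplicated()
print(mask)

# keep="first" is the default: the first column of each name survives.
first = df.loc[:, ~df.columns.duplicated(keep="first")]

# keep="last" flags the earlier occurrences instead, so the last one survives.
last = df.loc[:, ~df.columns.duplicated(keep="last")]

print(first)
print(last)
```

Here `first` holds the left-hand `a` column (1, 4, 7) and `last` holds the right-hand one (2, 5, 8). In both cases the comparison is on column names only, not on the values in the columns.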
DYZ