If you have multiple columns with the same name in a dataframe, how do you remove all of the columns except the first one?
7
-
You can use the `pandas` method `drop_duplicates` with the parameter `keep='first'` – MMF Jun 06 '17 at 21:45
1 Answer
22
Let `df` be a dataframe in which two columns share the same name:
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=("a", "a", "b"))
#    a  a  b
# 0  1  2  3
# 1  4  5  6
# 2  7  8  9
`df.columns.duplicated()` marks every column whose name repeats an earlier one. Negate that mask to keep only the first occurrence of each name:
df1 = df.loc[:, ~df.columns.duplicated()]
# a b
#0 1 3
#1 4 6
#2 7 9
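As a small sketch of how the mask behaves, the snippet below rebuilds the same `df` and also tries the `keep` argument of `Index.duplicated`. The variable names `mask`, `first_only`, `last_only` and `unique_only` are only illustrative and are not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]], columns=("a", "a", "b"))

# True for each column whose name already appeared further to the left.
# keep="first" is the default, so the first "a" is not flagged.
mask = df.columns.duplicated()
first_only = df.loc[:, ~mask]

# keep="last" flags the earlier occurrences instead, so the last "a" survives.
last_only = df.loc[:, ~df.columns.duplicated(keep="last")]

# keep=False flags every name that occurs more than once, leaving only "b".
unique_only = df.loc[:, ~df.columns.duplicated(keep=False)]
```

Selecting with a boolean mask through `.loc` is positional here, which is why it works even though the label `"a"` is ambiguous; selecting by label (for example `df["a"]`) would return both columns.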

DYZ