
I have a CSV file with duplicates that appear only in the column named "Fichier". I wrote the following lines:

df = pd.read_csv(path_to_file, encoding='utf-8', sep=',')
df.drop_duplicates(subset="Fichier", keep='first', inplace=True)

But it doesn't work. I even tried to do it via Excel, but that doesn't work either.

Many thanks in advance!

  • You can look at [Python pandas remove duplicate columns](https://stackoverflow.com/questions/14984119/python-pandas-remove-duplicate-columns) – Hrushi Jun 16 '22 at 09:41

1 Answer


You can try this; it works for me:

# In my case
metadata = pd.read_csv('CSV/data_full.csv', low_memory=False)

# Build a Series of row indices keyed by the 'Fichier' column,
# then drop duplicated entries
myresult = pd.Series(metadata.index, index=metadata['Fichier']).drop_duplicates()
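
For comparison, here is a minimal self-contained sketch of the `drop_duplicates(subset=...)` approach from the question, run on a toy DataFrame (the column name `Fichier` and the sample data are assumptions for illustration). The key points are that the subset name must match the column exactly and that, without `inplace=True`, the result must be assigned back:

```python
import pandas as pd

# Toy frame with duplicate values only in the "Fichier" column
df = pd.DataFrame({
    "Fichier": ["a.txt", "b.txt", "a.txt", "c.txt"],
    "size": [1, 2, 3, 4],
})

# Keep the first row for each distinct "Fichier" value;
# drop_duplicates returns a new frame, so assign the result back
deduped = df.drop_duplicates(subset="Fichier", keep="first")

print(deduped["Fichier"].tolist())  # ['a.txt', 'b.txt', 'c.txt']
```

If the column name were misspelled (e.g. `"file"` instead of `"Fichier"`), `drop_duplicates` would raise a `KeyError` rather than silently doing nothing.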