In my project I load data from Twitter every other day and append it to a CSV file. This procedure leads to exact duplicates of tweets in the CSV file, so I want to remove these exact duplicates.

However, when I run the following code:

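import pandas as pd

# Load the accumulated tweets with the Python parsing engine
data = pd.read_csv("Hashtags.csv", engine="python")

# Drop rows that are identical in every column, then overwrite the file
data.drop_duplicates(subset=None, inplace=True)
data.to_csv("Hashtags.csv", index=False)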

and then try to load the CSV file again, I get the following error:

pandas.errors.ParserError: ',' expected after '"'

Before I dropped the duplicates I had no problems loading the file. It almost seems as if the drop_duplicates function inserts unnecessary " characters. Does anyone know how to solve this problem?
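
For what it's worth, the wording of that message matches what Python's built-in csv module raises in strict mode when a closing quote is followed by something other than the delimiter. Below is a minimal sketch for locating the first such line, assuming a comma-delimited file with " as the quote character. The helper name first_bad_line and the sample text are made up for illustration and are not part of my project:

```python
import csv
import io


def first_bad_line(handle):
    """Return (line_number, message) for the first malformed row, or None."""
    # strict=True makes the csv module raise instead of silently accepting
    # stray characters after a closing quote.
    reader = csv.reader(handle, strict=True)
    try:
        for _ in reader:
            pass
    except csv.Error as exc:
        # line_num counts the physical lines read from the source so far
        return reader.line_num, str(exc)
    return None


# Hypothetical sample: the third line has a character right after a closing quote
sample = io.StringIO('id,text\n1,"ok"\n2,"broken"x,rest\n', newline="")
print(first_bad_line(sample))
```

For the real file, the same helper could be called on open("Hashtags.csv", newline="", encoding="utf8"), assuming the file is UTF-8 encoded.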

Thank you very much in advance!

1 Answer

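I still have no idea why, but changing how I save the pandas DataFrame to the CSV file fixed it. Instead of passing the file name to to_csv, I now open the file myself with an explicit UTF-8 encoding and pass the file object: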

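fname = 'Hashtags'

# data is the DataFrame loaded with pd.read_csv in the question
data.drop_duplicates(subset=None, inplace=True)

# Open the output file explicitly as UTF-8 and let pandas write into it
with open('%s.csv' % fname, 'w', encoding="utf8") as file:
    data.to_csv(file, index=False)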
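
To check that a file written this way can be loaded again, a round trip on a small frame can help. The following is only a sketch under assumptions: the sample rows and column names are invented, quoting=csv.QUOTE_ALL is an optional extra I have not tested against the original data, and newline="" follows the note in the pandas to_csv documentation that a file object passed in should be opened that way.

```python
import csv
import os
import tempfile

import pandas as pd

# Invented sample data: tweet text containing commas, quotes and a line
# break, plus one exact duplicate row.
data = pd.DataFrame(
    {
        "id": [1, 2, 2, 3],
        "text": [
            'He said "hello", then left',
            "line one\nline two",
            "line one\nline two",
            "plain tweet",
        ],
    }
)

# Remove rows that are identical in every column
deduped = data.drop_duplicates().reset_index(drop=True)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "Hashtags.csv")

    # newline="" leaves line endings to pandas; QUOTE_ALL wraps every field
    # in quotes so embedded commas and line breaks stay inside one field.
    with open(path, "w", encoding="utf8", newline="") as file:
        deduped.to_csv(file, index=False, quoting=csv.QUOTE_ALL)

    # Read the file back while the temporary directory still exists
    reloaded = pd.read_csv(path, encoding="utf8")

# The reloaded frame should have the same rows as the deduplicated one
print(len(data), len(deduped), len(reloaded))
```

If the reloaded frame has a different number of rows than the deduplicated one, the problem is in the write or read step rather than in drop_duplicates.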