I am trying to clean some Twitter data stored in a CSV file, using Python in a Jupyter notebook. This is the code I tried:
unwanted_characters = [',', '@', '\n', '&', '_']
with open('facebook_Tweet.csv', 'r') as f:
    with open('cleaned_facebook_Tweet.csv', 'w') as ff:
        for unwanted in unwanted_characters:
            ff.write(f.read().replace(unwanted, ''))
tweety = pd.read_csv("cleaned_facebook_Tweet.csv", error_bad_lines=False)
tweety.head()
When I run this code, I get this result:
tweet1:Time:Sun Dec 06 09:59:02 +0000 2020 tweet text:RT @_Aaron_Anthony_: Seen this of Facebook and it hit home.\n\nRemember this Christmas if someone pays \u00a320 for a gift for you and they get\u2026
tweet2:Time:Sun Dec 06 09:59:02 +0000 2020 tweet text:RT @TopAchat: Concours \ud83c\udf81 #PetitPapaTopAchat \ud83c\udf84\n\n\ud83d\udd25 + de 60 000 \u20ac de cadeaux \u00e0 gagner !\n\nCa continue avec le #Lot7 de 4333 \u20ac ! \ud83d\udd25\n\nPour partici\u2026
As you can see, the unwanted characters are still there. My code only removes the first unwanted character in the list (the ',' in my example) and keeps the others, for example the '@' and the '\n'.
How can I fix my code? Thanks a lot.
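One likely cause is that f.read() consumes the whole file on the first pass of the loop. Every later f.read() returns an empty string, so only the first character in the list is ever replaced. Below is a minimal sketch of a possible fix: read the text once, apply every replacement to the same string, and write the result once. The helper names remove_unwanted and clean_file are made up for illustration, and the utf-8 encoding is an assumption. The '\n' in the output above also looks like a literal backslash followed by n rather than a real line break, so the sketch removes the two-character sequence '\\n'; that is a guess based on the printed result.

```python
def remove_unwanted(text, unwanted_characters):
    """Return text with every string in unwanted_characters removed."""
    for unwanted in unwanted_characters:
        # Reassign so each replacement builds on the previous one.
        text = text.replace(unwanted, '')
    return text


def clean_file(src_path, dst_path, unwanted_characters):
    """Hypothetical helper: clean src_path and write the result to dst_path."""
    # Read the file ONCE. A second f.read() would return '' because the
    # file position is already at the end of the file.
    with open(src_path, 'r', encoding='utf-8') as f:  # encoding is assumed
        text = f.read()
    with open(dst_path, 'w', encoding='utf-8') as ff:
        ff.write(remove_unwanted(text, unwanted_characters))


# '\\n' is backslash + n as two literal characters (assumption about the data),
# not a real newline, so the line breaks between tweets are left alone.
unwanted_characters = [',', '@', '\\n', '&', '_']

sample = 'RT @_Aaron_: hit home.\\n\\nRemember this, ok&bye'
print(remove_unwanted(sample, unwanted_characters))
```

With the file names from the question, the call would be clean_file('facebook_Tweet.csv', 'cleaned_facebook_Tweet.csv', unwanted_characters), followed by the same pd.read_csv line as before. Keep in mind that removing every ',' also removes the column separators if the file is comma-delimited.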