I'm new to Python. I have a CSV-file with tweet entries formatted like this:
15,Oct 11,785816454042124288,/realDonaldTrump/status/785816454042124288,False,"Despite winning the second debate in a landslide (every poll), it is hard to do well when Paul Ryan and others give zero support!",DonaldTrump
and another:
16,Oct 10,785563318652178432,/realDonaldTrump/status/785563318652178432,False,"Wow, @CNN got caught fixing their ""focus group"" in order to make Crooked Hillary look better. Really pathetic and totally dishonest!",DonaldTrump
In Python, I load the contents using Pandas like this:
import pandas as pd
data = pd.read_csv(arg, sep=',')
Now I would like to clean the CSV-file and save only the user ID (3rd entry on each row) and the tweet itself (I think the 6th entry). As you can see, I split on commas by using sep=','. The problem is that some tweets contain commas, and I don't want those commas to be treated as separators and removed by the splitting. If the separator between tweet number, date, user_id, and so on had been something other than a comma, this would have been a lot easier. Any suggestions on how to do this? I just want a new CSV-file without the information that I don't need.
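To make it concrete, here is a minimal sketch of what I am after. It rests on a few assumptions: the file has no header row, the tweet text is always wrapped in double quotes as in the two rows above (in which case the CSV parser should keep the commas inside the quotes as part of the field), and the column names user_id and tweet are just labels I made up. The two sample rows are read from an in-memory string here instead of the real file.

```python
import io
import pandas as pd

# The two sample rows from above, held in memory in place of the real file.
sample = io.StringIO(
    '15,Oct 11,785816454042124288,/realDonaldTrump/status/785816454042124288,False,'
    '"Despite winning the second debate in a landslide (every poll), it is hard to do well '
    'when Paul Ryan and others give zero support!",DonaldTrump\n'
    '16,Oct 10,785563318652178432,/realDonaldTrump/status/785563318652178432,False,'
    '"Wow, @CNN got caught fixing their ""focus group"" in order to make Crooked Hillary '
    'look better. Really pathetic and totally dishonest!",DonaldTrump\n'
)

# header=None: assumes the file has no header row.
# usecols=[2, 5]: keep only the 3rd and 6th entries (zero-based positions).
# dtype=str: keep the long ID as text so it is written back unchanged.
data = pd.read_csv(sample, sep=',', header=None, usecols=[2, 5], dtype=str)

# Hypothetical column names, chosen only for this example.
data.columns = ['user_id', 'tweet']

# Write the reduced table; a real path such as 'cleaned.csv' could be used
# instead of this in-memory buffer.
out = io.StringIO()
data.to_csv(out, index=False)
print(out.getvalue())
```

If this is right, the commas inside the tweets stay in the tweet column, and to_csv puts quotes around those fields again when writing the new file.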