
I'm scraping tweets and saving them with pandas. One string isn't saving right. This is my code:

    for item in client.dataset(run["defaultDatasetId"]).iterate_items():
        # One single-row DataFrame per scraped tweet
        tweet_df = pd.DataFrame()

        # Date, retweet flag, and tweet text
        tweet_df['Date'] = [item['created_at'][0:19]]  # keep the first 19 characters of the timestamp
        tweet_df['Retweet'] = ["True" if item['is_retweet'] == True else "False"]
        tweet_df['Tweet'] = [item['full_text']]

The string that isn't saving correctly is: "The Starship team is go for prop load; team is keeping an eye on the weather"

I am saving the dataframe to a csv and looking at the file:

column D                                        column E
Tweet                                       |
________________________________________________________________________
The Starship team is go for prop load       | team is keeping an eye on the weather 

The extra csv column (column E) has no title and doesn't exist in the dataframe. This happens with several strings, but more than 90% of them save fine.

I have tried a regular pandas save.

  • Is the core issue that `;` is being interpreted as a column separator? – JonSG Apr 21 '23 at 13:59
  • Your question needs a minimal reproducible example consisting of sample input, expected output, actual output, and only the relevant code necessary to reproduce the problem. See [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) for best practices related to Pandas questions. – itprorh66 Apr 21 '23 at 14:00
  • Rather than a pandas issue, looking at your code, is it an issue with how `client.dataset(run["defaultDatasetId"])` is parsing your data? Perhaps that method accepts a specifier for delimiters. – JonSG Apr 21 '23 at 14:05
  • @JonSG Yeah, the `;` is being interpreted as a separator. Thank you – John Langham Apr 21 '23 at 15:28
  • @JohnLangham, can you share a reproducible example *as text* that triggers the issue? – Timeless Apr 21 '23 at 15:29
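
For anyone landing here with the same symptom, the following is a minimal sketch of the behavior discussed in the comments, not a confirmed diagnosis. It assumes the program used to view the csv treats `;` as the column delimiter, which the question does not state; the standard-library `csv.reader` with `delimiter=";"` stands in for that viewer. The two-column frame is a cut-down version of the one in the question.

```python
import csv
import io

import pandas as pd

# Tweet text taken from the question; the frame is a reduced stand-in for tweet_df.
text = "The Starship team is go for prop load; team is keeping an eye on the weather"
df = pd.DataFrame({"Retweet": ["False"], "Tweet": [text]})

# Default to_csv writes comma-separated values; the ";" stays inside the field unquoted.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
default_csv = buffer.getvalue()

# Read back with a comma separator: the tweet is still a single column.
round_trip = pd.read_csv(io.StringIO(default_csv), dtype=str)

# Assumption: a viewer that splits on ";" instead breaks the tweet in two.
split_rows = list(csv.reader(io.StringIO(default_csv), delimiter=";"))

# Writing with sep=";" makes pandas quote any field that contains ";",
# so a ";"-delimited reader sees the tweet as one field again.
semi_buffer = io.StringIO()
df.to_csv(semi_buffer, index=False, sep=";")
semi_rows = list(csv.reader(io.StringIO(semi_buffer.getvalue()), delimiter=";"))

print(round_trip.shape)
print(split_rows[1])
print(semi_rows[1])
```

If the file is only ever read back with pandas, the default comma-separated output round-trips without changes. The `sep=";"` variant matters only when the consumer of the file expects semicolons.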
