1

I'm having a problem collecting Arabic tweets and saving them to a CSV file.

When I open the CSV file, the tweets look like this:

[screenshot not preserved]

Here is the code:


import tweepy
import csv


# Twitter API credentials

consumer_key = "..."
consumer_secret = ".."
access_key = "..."
access_secret = "...."

auth= tweepy.OAuthHandler(consumer_key,consumer_secret)
auth.set_access_token(access_key,access_secret)
api= tweepy.API(auth,wait_on_rate_limit=True)


csvFile=open('tweets.csv','a',newline='')
csvWriter=csv.writer(csvFile)
#truncated=False,
for tweet in tweepy.Cursor(api.search,q="اكتئاب",since="2021-01-30",truncated=False,tweet_mode="extended", count=1).items():

    if (not tweet.retweeted) and ('RT @' not in tweet.full_text):
        csvWriter.writerow([tweet.full_text.encode('utf-8-sig')])

Please, I need your help :'(

2 Answers

1
  1. For the empty lines you get, see this answer:
    • add the parameter newline='' to the open(...) statement
  2. To get the full tweet text (280 chars), use Extended Mode when invoking the API and/or the Cursor()
    • tweet_mode='extended'
    • and read the attribute full_text instead of just text to get the text of each tweet.
    • You'll also need to handle retweets slightly differently.
  3. For the Full URLs, see this other answer:
    for url in status.entities['urls']:
        links = url['expanded_url']
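
Points 2 and 3 above can be sketched together as two small helpers. This is only an illustration, not the answerer's code: `full_text_of` and `expanded_urls` are hypothetical names, and the sketch assumes statuses fetched with tweet_mode='extended', where a retweet carries the original tweet in `retweeted_status`. The `SimpleNamespace` objects are stand-ins for real tweepy statuses so the example runs without API credentials.

```python
from types import SimpleNamespace


def full_text_of(status):
    """Return the untruncated text, preferring the original tweet for retweets."""
    # Assumption: in extended mode a retweet exposes the original status
    # as `retweeted_status`, whose full_text is not cut off.
    original = getattr(status, "retweeted_status", None)
    if original is not None:
        return original.full_text
    return status.full_text


def expanded_urls(status):
    """Collect every expanded URL instead of overwriting a single variable."""
    return [url["expanded_url"] for url in status.entities.get("urls", [])]


# Stand-in objects (hypothetical data) shaped like tweepy statuses.
plain = SimpleNamespace(
    full_text="hello",
    entities={"urls": [{"expanded_url": "https://example.com/a"}]},
)
retweet = SimpleNamespace(
    full_text="RT @someone: hello wor",
    retweeted_status=SimpleNamespace(full_text="hello world"),
    entities={},
)

plain_text = full_text_of(plain)
retweet_text = full_text_of(retweet)
plain_links = expanded_urls(plain)
retweet_links = expanded_urls(retweet)
```

With real data, the same helpers would be called on each item yielded by the Cursor loop.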
    
aneroid
  • I've tried tweet_mode and full_text but I'm still not getting the complete tweet; see the code I've edited. – Reem Abdulrhman Jan 30 '21 at 16:43
  • Give an example of a tweet where the `full_text` is not the _full_ `full_text`. (Actual tweet text visible on twitter vs `full_text`.) – aneroid Jan 30 '21 at 17:05
  • Thank you, I checked again and it works, but I have a problem with the CSV file. Please check my update. :( – Reem Abdulrhman Jan 30 '21 at 23:18
  • Put the `encoding='utf-8-sig'` part in `open`, not on `full_text`. _Edit:_ I see you've answered that part yourself below. – aneroid Jan 31 '21 at 01:35
  • Also, use [`open` as a context manager](https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files): `with open('tweets.csv','a',newline='', encoding='utf-8-sig') as csvFile:`. The for loops, if blocks, etc. should all be indented in a level under that. – aneroid Jan 31 '21 at 01:39
  • Another thing - don't modify your existing question to ask a new question once you already have an answer which solved the original question. Generally, ask a new question with the specifics of just the new/next problem. – aneroid Jan 31 '21 at 01:40
0

I found my answer: adding these two lines to my code fixes it.

#coding:utf8
csvFile=open('tweets.csv','a',newline='',encoding='utf-8-sig')
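
A minimal sketch of why this works, assuming Python 3. The file name `tweets_demo.csv` and the helpers `write_rows` and `read_rows` are made up for illustration. The key point is that the encoding belongs on `open`, while the rows stay ordinary `str` values. Calling `.encode(...)` on the text hands `csv.writer` a `bytes` object, which it writes out as its `str()` form (`b'\xef\xbb\xbf...'`) rather than as Arabic text.

```python
import csv
import os
import tempfile


def write_rows(path, texts):
    """Append one text per row; the file object handles the encoding."""
    with open(path, "a", newline="", encoding="utf-8-sig") as csv_file:
        writer = csv.writer(csv_file)
        for text in texts:
            writer.writerow([text])  # pass str, not bytes


def read_rows(path):
    """Read the first column back, stripping the BOM via utf-8-sig."""
    with open(path, newline="", encoding="utf-8-sig") as csv_file:
        return [row[0] for row in csv.reader(csv_file)]


# Write to a temporary directory so the demo leaves tweets.csv untouched.
path = os.path.join(tempfile.mkdtemp(), "tweets_demo.csv")
write_rows(path, ["اكتئاب", "مرحبا"])
rows = read_rows(path)

# What the original code effectively wrote into each cell:
# the text form of a bytes object, not readable Arabic.
bytes_cell = str("اكتئاب".encode("utf-8-sig"))
```

The `utf-8-sig` codec writes a byte order mark at the start of the file, which is what lets spreadsheet programs such as Excel detect that the file is UTF-8.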

the source