Hi guys, I am working on a personal project in which I search for tweets containing specific keywords. I collected about 100 recent tweets for each keyword and saved them to the variables x1_tweets, x2_tweets and x3_tweets. Each variable is a list of dictionaries, and the fields look like this:

['created_at', 'id', 'id_str', 'text', 'truncated', 'entities', 'metadata', 'source', 'in_reply_to_status_id', 'in_reply_to_status_id_str', 'in_reply_to_user_id', 'in_reply_to_user_id_str', 'in_reply_to_screen_name', 'user', 'geo', 'coordinates', 'place', 'contributors', 'is_quote_status', 'retweet_count', 'favorite_count', 'favorited', 'retweeted', 'lang']

I then wanted to save just the text of the tweets from each variable to a JSON file. For that I defined a function that saves an object to a JSON file (obj is the list of dictionaries and filename is the name I want to save it as):

import json

def save_to_json(obj, filename):
    with open(filename, 'w') as fp:
        json.dump(obj, fp, indent=4, sort_keys=True)

To get only the tweet text, I wrote the following code:

for i, tweet in enumerate(x1_tweets):
    save_to_json(tweet['text'],'bat')

However, I have had no success so far. Can anyone point me in the right direction? Thanks in advance!

edit: I am using twitterAPI

  • Not exactly sure what your problem is, but check `pickle` for serialization. You can see an example here: https://stackoverflow.com/questions/11218477/how-can-i-use-pickle-to-save-a-dict – Ardweaden May 19 '20 at 09:31
  • Hi, I want to use JSON for my output file. I know the error is in the last bit of code, but I can't quite figure out what it is. If you want, I can show you my JSON file containing all the fields? – karan sethi May 19 '20 at 09:34
  • When you ask a question, you should share the error if you got one, or show what the output is and what the expected output should be. Otherwise it's impossible to answer. Nonetheless, it is likely related to this warning in `json` documentation: `Note Unlike pickle and marshal, JSON is not a framed protocol, so trying to serialize multiple objects with repeated calls to dump() using the same fp will result in an invalid JSON file.` – Ardweaden May 19 '20 at 09:40
  • https://docs.python.org/3/library/json.html – Ardweaden May 19 '20 at 09:40
  • Does this answer your question? [how to get only the text of the tweets into a json file](https://stackoverflow.com/questions/61884620/how-to-get-only-the-text-of-the-tweets-into-a-json-file) – Andy Piper May 19 '20 at 12:12

1 Answer

The first thing to do is change the function so that it opens the file in append mode:

def save_to_json(obj, filename):
    with open(filename, 'a') as fp:
        json.dump(obj, fp, indent=4, sort_keys=True) 

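The mode has to change because your loop calls the function once per tweet. With 'w', every call overwrites the file, so only the last tweet is left at the end. The two modes behave as follows: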

w: Opens the file for writing only. The pointer is placed at the beginning of the file and any existing content is overwritten. A new file is created if one with the same name doesn't exist.

a: Opens a file for appending new information to it. The pointer is placed at the end of the file. A new file is created if one with the same name doesn't exist.
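
The difference between the two modes can be seen in a small experiment. This is only a sketch: the sample strings, the temporary file names, and the extra `mode` parameter are made up for the demo and are not part of the original function. It also checks whether the appended file can be read back as one JSON document.

```python
import json
import os
import tempfile

def save_to_json(obj, filename, mode):
    # Same idea as the function above, with the file mode exposed for the demo.
    with open(filename, mode) as fp:
        json.dump(obj, fp)

texts = ["first tweet", "second tweet", "third tweet"]  # made-up sample data

tmpdir = tempfile.mkdtemp()
w_path = os.path.join(tmpdir, "w_mode.json")
a_path = os.path.join(tmpdir, "a_mode.json")

for text in texts:
    save_to_json(text, w_path, "w")  # overwrites the file on every call
    save_to_json(text, a_path, "a")  # adds to the end of the file on every call

with open(w_path) as fp:
    w_content = fp.read()
with open(a_path) as fp:
    a_content = fp.read()

print(w_content)  # only the last tweet is left
print(a_content)  # all three tweets, written back to back

# Check whether the appended file parses as a single JSON document.
try:
    json.loads(a_content)
    a_is_valid_json = True
except json.JSONDecodeError:
    a_is_valid_json = False
print(a_is_valid_json)
```

With 'a' nothing is lost, but the strings are written one after another with no separator, so the file as a whole does not parse as JSON. This matches the note from the `json` documentation quoted in the comments.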

Also, sort_keys has no effect here because you are passing a string, not a dict. Likewise, indent=4 has no effect on a string.
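
A quick way to check this is to compare the output of `json.dumps` with and without the two options. The sample string and the sample dict below are invented for illustration.

```python
import json

text = "just a tweet"  # made-up sample string

plain = json.dumps(text)
styled = json.dumps(text, indent=4, sort_keys=True)
print(plain == styled)  # the options change nothing for a plain string

# For a dict the same options do matter: keys are sorted and nested on new lines.
record = {"text": "just a tweet", "id": 1}
print(json.dumps(record, indent=4, sort_keys=True))
```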

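Note that appending writes the strings back to back, which is not a valid JSON document (see the `json` documentation note quoted in the comments). To get valid JSON, and an index for each tweet, collect the texts first and write them once: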

tweets = {}
for i, tweet in enumerate(x1_tweets):
    tweets[i] = tweet['text']  # index -> tweet text
save_to_json(tweets, 'bat.json')  # write once, after the loop

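The above code builds a dict that maps each index to the tweet text and writes it to the file once, after all tweets are processed. JSON object keys are always strings, so the integer indexes end up as "0", "1", and so on in the file.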
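
If the index is not needed, a plain list of texts may be simpler. The following is a minimal sketch, assuming each tweet is a dict with a 'text' field as in the question. The two sample tweets and the temporary path are made up so that the snippet runs on its own, and 'w' mode is used because the file is written only once.

```python
import json
import os
import tempfile

# Made-up stand-in for x1_tweets: a list of dicts that each have a 'text' field.
x1_tweets = [
    {"id": 1, "text": "first tweet"},
    {"id": 2, "text": "second tweet"},
]

def save_to_json(obj, filename):
    # One call writes one JSON document, so 'w' is fine here.
    with open(filename, "w") as fp:
        json.dump(obj, fp, indent=4)

# Collect only the text of every tweet, then write the whole list in one go.
texts = [tweet["text"] for tweet in x1_tweets]
path = os.path.join(tempfile.mkdtemp(), "bat.json")
save_to_json(texts, path)

# Reading the file back returns the same list.
with open(path) as fp:
    loaded = json.load(fp)
print(loaded)
```

Because the file holds a single JSON array, `json.load` can read it back directly.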