
I streamed tweets using the following code:

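import tweepy

class CustomStreamListener(tweepy.StreamListener):
    def on_data(self, data):
        try:
            # Append each raw tweet payload to the file as it arrives.
            with open('brasil.json', 'a') as f:
                f.write(data)
                return True
        except BaseException as e:
            print("Error on_data: %s" % str(e))
        # Keep the stream running even if a write failed.
        return True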

Now I have a JSON file (brasil.json). I want to open it in Python to do sentiment analysis, but I can't find a way. I managed to open the first tweet using this:

import json

tweets = []
with open('brasil.json') as f:
    for line in f:
        tweets.append(json.loads(line))

but it doesn't read all the other tweets. Any idea?

Diego Marino
  • Your code works when I test it. What is the length of your final `tweets` list? I ran your `CustomStreamListener` for about a minute and got 1,813 tweets, then ran your code to read the saved `json` and `tweets` list is 1,813 long. So, I can't reproduce reading only one tweet, it reads all tweets. Check the length of your `tweets` list again? – chickity china chinese chicken Dec 15 '18 at 01:10
  • I also have around 2000 tweets (if I open it with excel). But in python I get this error: json.decoder.JSONDecodeError: Expecting value: line 2 column 1 (char 1) – Diego Marino Dec 15 '18 at 02:29
  • It seems like there is missing data (the "Expecting value") or the data written is inconsistent. What does `"line 2 column 1 (char 1)"` look like when you load the file in Excel? (line 2, column 1) – chickity china chinese chicken Dec 15 '18 at 02:40
  • When I open it in Excel, all the tweets are on the odd-numbered rows. The even-numbered rows are blank – Diego Marino Dec 15 '18 at 02:42
  • ok there are 2 easy fixes: you can either 1) read only the odd number rows (`for n, line in enumerate(f):`), or 2) use `try / except` with `except json.decoder.JSONDecodeError: pass`. Try whichever one you prefer and see if it works. Or, if you'd prefer I could put them in an answer, as it's difficult to format code in comments – chickity china chinese chicken Dec 15 '18 at 02:49
  • Please, I would appreciate that – Diego Marino Dec 15 '18 at 02:53
  • No sample data was provided but I guess you have a JSON stream, so [How to extract multiple JSON objects from one file?](https://stackoverflow.com/questions/27907633/how-to-extract-multiple-json-objects-from-one-file) might help future visitors. – ggorlen Sep 29 '22 at 18:42

1 Answer

From the comments: after examining the contents of the JSON data file, all the tweets are on the odd-numbered rows. The even-numbered rows are blank.

Calling json.loads on one of the blank rows is what raised the json.decoder.JSONDecodeError.
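
To see why, here is a minimal sketch that reproduces the error on a blank line. It assumes the even rows contain nothing but a newline character, which matches the message reported in the comments. `try_parse` is a hypothetical helper used only for illustration.

```python
import json

def try_parse(line):
    """Return the parsed object, or the error message if the line is not valid JSON."""
    try:
        return json.loads(line)
    except json.JSONDecodeError as e:
        return str(e)

# A line holding only a newline has no JSON value to decode.
print(try_parse("\n"))  # Expecting value: line 2 column 1 (char 1)

# A line holding a JSON object parses normally.
print(try_parse('{"text": "oi"}\n'))  # {'text': 'oi'}
```

The "line 2 column 1" in the message refers to the position inside the string passed to json.loads, not to row 2 of the file: the parser skips the newline and then finds nothing to decode.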

There are two ways to handle this error: read only the odd rows, or use exception handling.

Using odd rows:

import json

tweets = []
with open('brasil.json') as f:
    for n, line in enumerate(f, 1):  # number the rows starting from 1
        if n % 2 == 1:  # this line is in an odd-numbered row
            tweets.append(json.loads(line))

Using exception handling:

import json

tweets = []
with open('brasil.json', 'r') as f:
    for line in f:
        try:
            tweets.append(json.loads(line))
        except json.decoder.JSONDecodeError:
            pass  # skip any line that is not valid JSON (e.g. a blank row)

Try both and see which one works best for your file.
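
A third variant, not discussed in the comments, is to skip blank lines explicitly instead of relying on row parity or on catching the exception. This is only a sketch: `load_tweets` is a hypothetical helper name, and it assumes every non-blank line holds one complete JSON object.

```python
import json

def load_tweets(path):
    """Read one JSON object per non-blank line and return them as a list."""
    tweets = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:  # blank row between tweets: nothing to parse
                continue
            tweets.append(json.loads(line))
    return tweets
```

Calling `load_tweets('brasil.json')` should then return the full list. Unlike the exception-handling version, a line that is non-blank but malformed still raises an error here, so a truncated tweet is not silently dropped.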