
I have followed various links and spent hours googling, but I could not find a simple way to convert the JSON data received from Tweepy's StreamListener to a Python dictionary so that it can be used with a pandas DataFrame. What I did was save the received data to a JSON file and then read it back using the json library, but that produces errors. I've also tried saving the stream data to a list and then converting it to a dictionary, but that did not work either.

Here is my code:

class StreamCollector(StreamListener):

    def __init__(self, api=None):
        super(StreamListener, self).__init__()
        self.num_tweets = 0

    def on_data(self, raw_data):
        try:
            with open('java.json', 'a') as f:
                f.write(raw_data)
            self.num_tweets += 1
            if self.num_tweets > 4:
                return False
            else:
                return True
        except BaseException as base_ex:
            print(base_ex)
            return False

    def on_error(self, status_code):
        print("Error Status code: --> {}".format(status_code))
        return True

try:
    twitterStream = Stream(auth, StreamCollector())
    twitterStream.filter(track=['#Java'])

    tweetDict = json.loads('java.json')
    print(type(tweetDict))
    print(tweetDict)

except TweepError as e:
    print(e)

The above code produces the following error:

    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
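
A note on this traceback: `json.loads` parses the string it is handed and never opens a file, so passing the literal text `'java.json'` asks it to parse the file name itself. The minimal sketch below reproduces the message using only the standard library. The variable name `message` is just for illustration.

```python
import json

message = None
try:
    # The argument is treated as JSON text, not as a path to a file.
    json.loads('java.json')
except json.JSONDecodeError as err:
    message = str(err)

print(message)
```

Reading from disk would instead go through `json.load` on an open file object, although that call expects the file to hold a single JSON document.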

EDIT: I checked my JSON file, and it appears that instead of a single object it contains multiple objects one after another, which causes an error, e.g.:
{"name":"abc","created_at":"abc date"} //No comma {"name":"xyz","created_at":"xyz date"}

The JSON file does not even have a root object or an array. How should I correct it?
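
One possible way to read such a file, sketched under the assumption that it holds complete JSON objects back to back (with or without newlines between them), is to decode them one at a time with `json.JSONDecoder.raw_decode`. The helper name `iter_json_objects` and the sample strings are hypothetical and only mirror the example above.

```python
import json

def iter_json_objects(text):
    """Yield each top-level JSON value found in a string of concatenated values."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not skip leading whitespace, so step over it here.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # raw_decode returns the parsed value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Hypothetical sample shaped like the file described in the question.
sample = '{"name":"abc","created_at":"abc date"}\n{"name":"xyz","created_at":"xyz date"}'
records = list(iter_json_objects(sample))
print(records)

# The same helper also copes with objects that have no separator at all.
packed = '{"name":"abc"}{"name":"xyz"}'
packed_records = list(iter_json_objects(packed))
```

For the file from the question, the text would come from `open('java.json').read()`. The resulting list of dictionaries can then be passed to `pandas.DataFrame(records)`, assuming pandas is installed, to get one row per object.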

Harsh
  • What are the contents of `raw_data`? Does it look like JSON to you? If it does, you can use `json.loads` at the raw data step. There is no need to write it to a file. – wkzhu Nov 21 '17 at 16:28
  • `raw_data` will be coming dynamically from Twitter which in this case is my twitter data. The format of that data is **JSON** – Harsh Nov 21 '17 at 18:10
  • In that case you can use `json.loads` directly. Check this solution: https://stackoverflow.com/a/18460958/7327411. It could be that the formatting is incorrect. – wkzhu Nov 21 '17 at 19:27
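
The suggestion in these comments can be sketched roughly as follows, assuming each `raw_data` payload is one complete JSON object. `TweetBuffer` is a hypothetical stand-in that does not subclass the Tweepy listener, so the snippet runs without Tweepy or network access. Only the `on_data` logic is meant to carry over.

```python
import json

class TweetBuffer:
    """Hypothetical stand-in for the listener: parses each payload as it arrives."""

    def __init__(self, limit=5):
        self.limit = limit
        self.tweets = []  # one dict per successfully parsed payload

    def on_data(self, raw_data):
        try:
            self.tweets.append(json.loads(raw_data))
        except json.JSONDecodeError as err:
            print("Skipping unparseable payload:", err)
        # Returning False mirrors the question's way of stopping the stream.
        return len(self.tweets) < self.limit

# Made-up payloads standing in for what the stream would deliver.
buffer = TweetBuffer(limit=2)
payloads = ['{"name": "abc"}\r\n', 'not json', '{"name": "xyz"}']
keep_going = [buffer.on_data(p) for p in payloads]
print(buffer.tweets)
```

Keeping the parsed dictionaries in a list avoids the round trip through a file, and the list can be handed to `pandas.DataFrame` once the stream stops.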
