I have followed various links and spent hours googling, but could not find a simple way to convert the JSON data received from Tweepy's StreamListener into a Python dictionary so that it can be used with a pandas DataFrame. What I did was save the received data to a JSON file and then read it back using the json library, but this produces errors. I've also tried saving the stream data to a list and then converting it to a dictionary, but that did not work either.
Here is my code:
    import json
    from tweepy import Stream, StreamListener, TweepError

    class StreamCollector(StreamListener):
        def __init__(self, api=None):
            # Call the parent initializer so that self.api is set up
            super(StreamCollector, self).__init__(api)
            self.num_tweets = 0

        def on_data(self, raw_data):
            try:
                # Append each raw tweet to the file; stop after 5 tweets
                with open('java.json', 'a') as f:
                    f.write(raw_data)
                    self.num_tweets += 1
                    if self.num_tweets > 4:
                        return False
                    else:
                        return True
            except BaseException as base_ex:
                print(base_ex)
                return False

        def on_error(self, status_code):
            print("Error Status code: --> {}".format(status_code))
            return True

    try:
        twitterStream = Stream(auth, StreamCollector())
        twitterStream.filter(track=['#Java'])
        tweetDict = json.loads('java.json')
        print(type(tweetDict))
        print(tweetDict)
    except TweepError as e:
        print(e)
The above code produces the following error:
    raise JSONDecodeError("Expecting value", s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
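One thing that may be contributing here: json.loads() expects a string of JSON text, not a path, so the call above tries to parse the literal text "java.json" instead of the file's contents. A minimal sketch that reproduces the same message without any file or stream (the variable name `message` is only for illustration):

```python
import json

message = None
try:
    # The argument is treated as JSON text, so the parser sees the
    # characters j-a-v-a-.-j-s-o-n and fails on the very first one.
    json.loads('java.json')
except json.JSONDecodeError as err:
    message = str(err)

print(message)
```

Reading from a file would instead need the file opened first and its text (or the file object, via json.load()) handed to the json module.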
EDIT: I checked my JSON file and it appears that, instead of a single object, it contains multiple objects one after another, which causes an error when the file is parsed as a whole.
For example:
    {"name":"abc","created_at":"abc date"} //No comma
    {"name":"xyz","created_at":"xyz date"}
The JSON file does not even have a root object or an array. How should I correct it?
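For reference, a file laid out like this (one complete object per line, no enclosing array) can probably be handled by parsing each line on its own. The following is only a sketch under that assumption; `parse_json_lines` and `sample` are hypothetical names, and the sample text stands in for the contents of java.json:

```python
import json

def parse_json_lines(text):
    """Parse text holding one JSON object per line into a list of dicts."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # Skip blank lines between objects
            continue
        records.append(json.loads(line))
    return records

# Stand-in for the file contents (assumed: one tweet object per line)
sample = (
    '{"name":"abc","created_at":"abc date"}\n'
    '{"name":"xyz","created_at":"xyz date"}\n'
)

records = parse_json_lines(sample)
print(type(records[0]))
print(records[1]["name"])
```

With the real file, the text would come from open('java.json').read(). The resulting list of dictionaries can be passed to pandas.DataFrame(records), and pandas.read_json('java.json', lines=True) is another way to read a line-delimited file directly.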