I have a Twitter dataset (multiple JSON files), but let's start with one file. I have to parse the JSON objects into Python, but json.loads()
only parses one object. A similar question has been asked here, but the proposed solutions are either not working or not good enough.
1- I cannot convert the JSON objects into a list, as that is not efficient and I have too much data. Also, the proposed solutions rely on "\n", while my Twitter data objects are written back to back like }{ with no newline between them, and I cannot add newlines manually. (The Twitter objects are also not stored one per line.) A sketch of the incremental parsing I have in mind follows this list.
2- The second suggested solution is JSONStream, but there is not much about it in the official documentation.
3- Is there any other efficient way? One option I am considering is MongoDB, but I have never worked with MongoDB, so I don't know whether it can handle this.
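For reference, this is the kind of incremental parsing I mean; it is only a minimal sketch, and it assumes the whole file fits in memory (which it may not, given the data size):

import json

# Sketch: parse back-to-back JSON objects ( ...}{... ) from one string.
# raw_decode parses one value and reports where it stopped, so no newline
# between objects is needed.
decoder = json.JSONDecoder()

with open('sampledata.json', 'r', encoding='utf8') as json_file:
    text = json_file.read()

count = 0
pos = 0
while pos < len(text):
    # skip any whitespace between objects (raw_decode does not tolerate it)
    while pos < len(text) and text[pos].isspace():
        pos += 1
    if pos >= len(text):
        break
    obj, pos = decoder.raw_decode(text, pos)
    count += 1
    print(obj)  # process each decoded tweet here instead of storing a list

print(count, "objects parsed")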
The picture below shows the length of one tweet object and the }{ boundary. This is what I tried:
import json

with open('sampledata.json', 'r', encoding='utf8') as json_file:
    # for i in json_file:
    while True:
        dataobj = json.load(json_file)
        print(dataobj)
        print("Printing each JSON Decoded Object")
Error (one object spans 287 lines):
raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 287 column 2 (char 10528)
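Since the file may be too big to read in one go, here is a chunked variant of the same raw_decode idea. It is only a sketch; iter_json_objects and the chunk size are my own names and assumptions, not from any library:

import json

def iter_json_objects(path, chunk_size=64 * 1024):
    # Sketch of a chunked reader for concatenated JSON objects ( ...}{... ).
    decoder = json.JSONDecoder()
    buffer = ''
    with open(path, 'r', encoding='utf8') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            buffer += chunk
            pos = 0
            while True:
                # skip whitespace between objects, if any
                while pos < len(buffer) and buffer[pos].isspace():
                    pos += 1
                try:
                    obj, pos = decoder.raw_decode(buffer, pos)
                except json.JSONDecodeError:
                    # the current object continues in the next chunk
                    break
                yield obj
            # keep only the unparsed tail for the next round
            buffer = buffer[pos:]
    if buffer.strip():
        raise ValueError('file ended in the middle of a JSON object')

for tweet in iter_json_objects('sampledata.json'):
    print(tweet)

This keeps only one chunk plus at most one partial object in memory at a time, which is why I think it might scale to the full dataset, but I would like to know if there is a more standard or more efficient way.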