This may be a duplicate, but after reading previous posts and answers I still haven't gotten my code to work. I have a very large file containing multiple JSON objects concatenated back to back, with no delimiter between them:
{"_index": "1234", "_type": "11", "_id": "1234", "_score": 0.0, "fields": {"c_u": ["url.com"], "tawgs.id": ["p6427"]}}{"_index": "1234", "_type": "11", "_id": "786fd4ad2415aa7b", "_score": 0.0, "fields": {"c_u": ["url2.com"], "tawgs.id": ["p12519"]}}{"_index": "1234", "_type": "11", "_id": "5826e7cbd92d951a", "_score": 0.0, "fields": {"tawgs.id": ["p8453", "p8458"]}}
I've read that this is exactly what JSON-RPC is supposed to look like, but I still can't open and parse the file to create a DataFrame in Python.
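I tried something along these lines:
import json

def iter_objects(s):  # s: the whole file contents as one string
    i = 0
    d = json.JSONDecoder()
    while True:
        try:
            obj, i = d.raw_decode(s, i)
        except ValueError:
            return  # stops at the first decode error, including end of input
        yield obj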
but it didn't work.
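For reference, here is a minimal sketch of the `raw_decode` idea as I understand it. One thing I noticed is that `raw_decode` does not accept leading whitespace, so if the real file has newlines or spaces between objects, the loop above would stop silently at the first one. The function name `iter_concatenated_json` and the shortened sample string are my own placeholders, not anything from a library or from my real file:

```python
import json

def iter_concatenated_json(text):
    """Yield each JSON value found back to back in `text` (hypothetical helper)."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode() rejects leading whitespace, so skip over it first.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos == end:
            break
        # Malformed input raises json.JSONDecodeError here instead of being hidden.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Shortened, made-up sample: two objects with no separator, one after a newline.
sample = (
    '{"_id": "1234", "fields": {"tawgs.id": ["p6427"]}}'
    '{"_id": "786fd4ad2415aa7b"}\n'
    '{"_id": "5826e7cbd92d951a"}'
)
objects = list(iter_concatenated_json(sample))
print(len(objects))
```

If something like this works, I assume the resulting list of dicts could then be handed to `pandas.DataFrame` or `pandas.json_normalize`, but I have not gotten that far.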
I've also tried a basic json.load:
with open('output.json', 'r') as f:
    data = json.load(f)
but that throws the error:
JSONDecodeError: Extra data: line 1 column 184 (char 183)
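As far as I can tell, "Extra data" means the parser read one complete JSON value and then found more text after it, and the reported character position is where that leftover text starts. A small sketch that seems to confirm this, using a made-up two-object string rather than my real file:

```python
import json

two_objects = '{"a": 1}{"b": 2}'  # made-up input, not the real file

try:
    json.loads(two_objects)
except json.JSONDecodeError as exc:
    error_message = exc.msg
    # exc.pos is the index where the unparsed remainder begins.
    first_object = two_objects[:exc.pos]
    print(error_message, "at char", exc.pos)
    print("parsed before the error:", first_object)
```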
Calling json.loads() on each line and appending the result didn't work either; data came back as an empty list []:
data = []
with open('es-output.json', 'r') as f:
    for line in f:
        try:
            data.append(json.loads(line))
        except json.decoder.JSONDecodeError:
            pass  # skip this line
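My guess at why this returns an empty list: if the whole file is effectively one line holding several objects, `json.loads(line)` raises on that line and the `pass` hides it. Below is a sketch that records the skipped lines instead of dropping them silently. The helper name `load_lines` and the one-line sample are hypothetical, chosen only to reproduce the behaviour:

```python
import json

def load_lines(lines):
    """Parse one JSON value per line; collect failures instead of ignoring them."""
    data, skipped = [], []
    for lineno, line in enumerate(lines, start=1):
        try:
            data.append(json.loads(line))
        except json.JSONDecodeError as exc:
            skipped.append((lineno, exc.msg))
    return data, skipped

# Made-up input: a single line containing two concatenated objects.
sample_lines = ['{"a": 1}{"b": 2}\n']
data, skipped = load_lines(sample_lines)
print(data)
print(skipped)
```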