I am working with Python and I have a file (data.json) which contains multiple JSON objects, but the file as a whole is not valid JSON.
The file looks like this:
{ "_id" : 01, ..., "path" : "2017-12-12" }
{ "_id" : 02, ..., "path" : "2017-1-12" }
{ "_id" : 03, ..., "path" : "2017-5-12" }
In place of the ... there are about 30 more keys, some of which hold nested JSON objects (so my point is that each object above is pretty long).
So each of the blocks in this single file is valid JSON on its own, but the file as a whole is not, since the blocks are not separated by commas etc.
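To make the situation concrete, here is a minimal sketch using only the standard json module. The three sample lines are shortened stand-ins for the real file: only the "_id" and "path" keys are kept, and the ids are written as plain integers, which is an assumption since JSON numbers cannot have leading zeros. It shows that parsing the whole text at once fails, while each line parses on its own.

```python
import json

# Stand-in for the file contents; the real objects have many more keys.
text = (
    '{"_id": 1, "path": "2017-12-12"}\n'
    '{"_id": 2, "path": "2017-1-12"}\n'
    '{"_id": 3, "path": "2017-5-12"}\n'
)

# Parsing the whole text as one document fails, because a second
# object follows the first one without any enclosing array.
try:
    json.loads(text)
    whole_file_is_json = True
except json.JSONDecodeError:
    whole_file_is_json = False

# Parsing line by line works, because each line is a complete object.
objects = [json.loads(line) for line in text.splitlines() if line.strip()]
```

This only holds if every object sits on exactly one line, as in the excerpt above.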
How can I read each of these JSON objects separately, either with pandas or with plain Python?
I have tried this:
import pandas as pd
df = pd.read_json('~/Desktop/data.json', lines=True)
It does create a dataframe where each row corresponds to one JSON object, but it also creates a column for each of the (first-level) keys, which makes things a bit messy. I would rather have the whole object in a single cell.
To be clearer, I would like my output to look like this in a pandas dataframe (or in another sensible data structure):
jsons
0 { "_id" : 01, ..., "path" : "2017-12-12" }
1 { "_id" : 02, ..., "path" : "2017-1-12" }
2 { "_id" : 03, ..., "path" : "2017-5-12" }
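As one possible "other sensible data structure", the sketch below reads the file into a plain list with one parsed object per line. The helper name read_json_lines is hypothetical, the demo file is a temporary stand-in with shortened objects and integer ids, and it assumes one complete object per line.

```python
import json
import os
import tempfile


def read_json_lines(path):
    """Return a list with one parsed object per non-empty line of the file."""
    records = []
    with open(os.path.expanduser(path), encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # skip blank lines, e.g. a trailing newline
                records.append(json.loads(line))
    return records


# Demo on a temporary stand-in file (not the real data.json).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "data.json")
    with open(demo_path, "w", encoding="utf-8") as fh:
        fh.write('{"_id": 1, "path": "2017-12-12"}\n')
        fh.write('{"_id": 2, "path": "2017-1-12"}\n')
        fh.write('{"_id": 3, "path": "2017-5-12"}\n')
    records = read_json_lines(demo_path)
```

Passing the resulting list to pandas as pd.DataFrame({"jsons": records}) should then give a single "jsons" column with one whole object per cell, rather than one column per first-level key.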