
I have a JSON file like below:

{
  "name":"A",
  "age":19
}
{
  "name":"B",
  "age":20
}

So basically the file contains a list of people.

I tried json.loads(str_content) in Python 3, but it raised json.decoder.JSONDecodeError: Extra data.

I checked with an online JSON parser (http://json.parser.online.fr) and it reported the same problem.

How can I parse a JSON file that has no root element but contains a sequence of JSON objects?

mommomonthewind

2 Answers


The issue is that the string you are trying to parse is not a valid JSON document. It is actually a concatenation of several JSON documents, so a plain json.loads() call will not work.
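
To see the failure concretely, here is a small sketch. The two-object string `text` is made up for illustration. The JSONDecodeError carries a `pos` attribute marking where the unexpected extra data starts, which here is the beginning of the second document:

```python
import json

# Hypothetical input: two JSON documents back to back, as in the question.
text = '{"name": "A", "age": 19}\n{"name": "B", "age": 20}'

try:
    json.loads(text)
    error = None
except json.JSONDecodeError as exc:
    error = exc

# error.msg is the short message; error.pos is the index where the extra data begins.
print(error.msg)
print(repr(text[error.pos:]))
```

The first document parses fine; the error only says that something follows it.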

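You can instead use something based on json.JSONDecoder.raw_decode (https://docs.python.org/3/library/json.html#json.JSONDecoder.raw_decode), which decodes a single document from a string and returns it together with the index at which that document ended. For example (the code is a bit ugly, but the logic should be clear):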

import json

s = """{
  "name":"A",
  "age":19
}
{
  "name":"B",
  "age":20
}"""

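def iter_jsons(s):
    decoder = json.JSONDecoder()

    i = 0
    while True:
        # raw_decode does not skip leading whitespace, so do it by hand
        while i < len(s) and s[i].isspace():
            i += 1
        if i >= len(s):
            break
        # decode one document starting at index i;
        # the returned index (absolute within s) points just past it
        doc, i = decoder.raw_decode(s, i)
        yield doc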

print(list(iter_jsons(s)))

[{'name': 'A', 'age': 19}, {'name': 'B', 'age': 20}]

Guillaume
  • If there are no nested dicts you could simply look for the closing brace and parse up to that point. – tripleee Nov 22 '18 at 09:00
  • I can't find anything in the json standard that indicates that a JSON document must have a root element. – TZubiri Nov 22 '18 at 09:08
  • @TomasZubiri It is not explicitly stated as text in https://tools.ietf.org/html/rfc7159#section-2 , but the ABNF indicates that a JSON-text is a SINGLE value – Guillaume Nov 22 '18 at 10:08
  • 2
    @TomasZubiri: I believe the standard indicates that a JSON document must either be a single JSON object enclosed in `{`, `}` brackets or a comma delimited list of them enclosed in `[`, `]` brackets. This answer looks like a very clever workaround allowing this non-compliant input to be decoded into the list it should have been in the first place. – martineau Nov 22 '18 at 10:11
  • @tripleee: that is correct, but I'd rather not assume anything about the content I am trying to parse. – Guillaume Nov 22 '18 at 10:15

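This works like a charm, provided the file contains a single JSON array of objects (for example [{"name":"A","age":19}, {"name":"B","age":20}]). On the concatenated file from the question, json.load() raises the same "Extra data" error.

import json

json_file = "myfile.json"

# the with block closes the file once it has been read
with open(json_file) as f:
    objects = json.load(f)

for person in objects:
    name = person['name']
    print(name)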
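
If you are not sure whether a file holds a single JSON array or several concatenated objects, one possible approach is to try json.loads() first and fall back to raw_decode when that fails. This is only a sketch: `load_records` is a hypothetical helper name, and it assumes the whole file fits in memory.

```python
import json

def load_records(path):
    """Return a list of objects from a JSON array file or a concatenated-JSON file."""
    with open(path, encoding="utf-8") as f:
        text = f.read()
    try:
        data = json.loads(text)
        # A single top-level value: wrap it unless it is already a list.
        return data if isinstance(data, list) else [data]
    except json.JSONDecodeError:
        pass
    # Fallback: decode one document at a time, skipping whitespace in between.
    # Input that is malformed in some other way raises JSONDecodeError again here.
    decoder = json.JSONDecoder()
    records, i = [], 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        doc, i = decoder.raw_decode(text, i)
        records.append(doc)
    return records
```

With that in place, `for person in load_records("myfile.json"): print(person["name"])` should behave the same for both file layouts.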
SoSpoon