The raw_decode(s) method from json.JSONDecoder sounds like what you need. To quote from its docstring:
raw_decode(s): Decode a JSON document from s (a str beginning with a JSON document) and return a 2-tuple of the Python representation and the index in s where the document ended. This can be used to decode a JSON document from a string that may have extraneous data at the end.
Example usage:
import json
s = """{
"areaId": "Tracking001",
"areaName": "Learning Theater Indoor",
"color": "#99FFFF"
}
{
"areaId": "Tracking001",
"areaName": "Learning Theater Indoor",
"color": "#33CC00"
}"""
decoder = json.JSONDecoder()
v0, i = decoder.raw_decode(s)
v1, _ = decoder.raw_decode(s[i+1:]) # i+1 needed to skip line break
Now v0 and v1 hold the parsed JSON values.
You may want to use a loop if you have thousands of values:
import json

with open("some_file.txt", "r") as f:
    # raw_decode does not skip leading whitespace, so strip it up front.
    content = f.read().strip()

parsed_values = []
decoder = json.JSONDecoder()
while content:
    value, new_start = decoder.raw_decode(content)
    # Drop the decoded part and any whitespace before the next value.
    content = content[new_start:].strip()
    # You can handle the value directly in this loop:
    print("Parsed:", value)
    # Or you can store it in a container and use it later:
    parsed_values.append(value)
Parsing 1000 of the JSON values above with this code took about 0.03 seconds on my computer. However, it becomes inefficient for larger files, because it reads the complete file into memory and copies the remaining string on every iteration.
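One way to avoid re-slicing the string on every pass is to give raw_decode a start index instead of a shortened string. The sketch below assumes the idx argument of raw_decode, which CPython's implementation accepts but the docstring quoted above does not mention, so treat it as an implementation detail. The helper name iter_json_values is made up for this example. The whole text still has to fit in memory.

```python
import json


def iter_json_values(text):
    """Yield each value from a string of concatenated JSON documents.

    Hypothetical helper for illustration; relies on the idx argument of
    raw_decode as implemented in CPython.
    """
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so advance past it.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos >= end:
            break
        # The returned index is an absolute position in text,
        # so no slicing (and no copying) is needed.
        value, pos = decoder.raw_decode(text, pos)
        yield value
```

Because this is a generator, you can process the values one at a time with `for value in iter_json_values(content): ...`, or collect them with `list(iter_json_values(content))`.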