1504

I have this JSON in a file:

{
    "maps": [
        {
            "id": "blabla",
            "iscategorical": "0"
        },
        {
            "id": "blabla",
            "iscategorical": "0"
        }
    ],
    "masks": [
        "id": "valore"
    ],
    "om_points": "value",
    "parameters": [
        "id": "valore"
    ]
}

I wrote this script to print all of the JSON data:

import json
from pprint import pprint

with open('data.json') as f:
    data = json.load(f)

pprint(data)

This program raises an exception, though:

Traceback (most recent call last):
  File "<pyshell#1>", line 5, in <module>
    data = json.load(f)
  File "/usr/lib/python3.5/json/__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.5/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.5/json/decoder.py", line 355, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 13 column 13 (char 213)

How can I parse the JSON and extract its values?

Karl Knechtel
michele
  • 1
    This question's status was discussed [here](https://meta.stackoverflow.com/q/381492/1394393). Community consensus was that this question was "good enough" to be left open after substantial edits. Please open a new discussion if you feel something has changed since that discussion. – jpmc26 May 19 '22 at 03:44
  • 4
    This question is being [discussed on meta](https://meta.stackoverflow.com/q/419062) for a second time. – cigien Jul 02 '22 at 05:25

3 Answers

2193

Your data is not valid JSON. You have [] where you should have {} for the "masks" and "parameters" elements:

  • [] are for JSON arrays, which are called list in Python
  • {} are for JSON objects, which are called dict in Python
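As a minimal sketch of that mapping (the inline strings here are illustrative stand-ins, not the contents of data.json), the following shows what `json.loads` returns for each bracket type, and that key/value pairs placed directly inside [] are rejected:

```python
import json

# A JSON object ({}) decodes to a dict; a JSON array ([]) decodes to a list.
obj = json.loads('{"id": "valore"}')
arr = json.loads('[{"id": "valore"}]')
print(type(obj).__name__)  # dict
print(type(arr).__name__)  # list

# Key/value pairs directly inside [] are not valid JSON, so decoding fails.
# json.JSONDecodeError is available in Python 3.5 and later.
try:
    json.loads('["id": "valore"]')
except json.JSONDecodeError as err:
    print("invalid JSON at line", err.lineno, "column", err.colno)
```

The `lineno` and `colno` attributes of the exception point at the place where the decoder gave up, which is usually close to the actual mistake.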

Here's how your JSON file should look:

{
    "maps": [
        {
            "id": "blabla",
            "iscategorical": "0"
        },
        {
            "id": "blabla",
            "iscategorical": "0"
        }
    ],
    "masks": {
        "id": "valore"
    },
    "om_points": "value",
    "parameters": {
        "id": "valore"
    }
}

Then you can use your code:

import json
from pprint import pprint

with open('data.json') as f:
    data = json.load(f)

pprint(data)

With data, you can now also find values like so:

data["maps"][0]["id"]
data["masks"]["id"]
data["om_points"]

Try those out and see if it starts to make sense.
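The index 0 does not have to be hard-coded. A small sketch, assuming the corrected JSON is held in an inline string (the variable names `text`, `maps` and `ids` are only illustrative), shows how `len` and a loop can walk every entry of "maps":

```python
import json

# Inline copy of the corrected structure, so the sketch runs without a file.
text = """
{
    "maps": [
        {"id": "blabla", "iscategorical": "0"},
        {"id": "blabla", "iscategorical": "0"}
    ],
    "masks": {"id": "valore"},
    "om_points": "value",
    "parameters": {"id": "valore"}
}
"""
data = json.loads(text)

maps = data["maps"]   # a Python list of dicts
print(len(maps))      # number of entries in the "maps" array

# Visit every entry instead of indexing with a fixed number.
for i, entry in enumerate(maps):
    print(i, entry["id"], entry["iscategorical"])

# Collect one field from each entry.
ids = [entry["id"] for entry in maps]
```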

jpmc26
Justin Peel
  • serialized data is wrapped with [] , and when you read it in you need f.read(), that is if you use the standard. – radtek Dec 23 '14 at 18:43
  • 5
    Thanks for the solution. i'm getting a unicode symbol while printing it. (eg u'valore' ). How to prevent it? – diaryfolio Jan 30 '15 at 15:36
  • 6
    Nice but python adds a `u'` before each key. Any idea why? – CodyBugstein Jul 05 '15 at 07:14
  • 7
    That is because your text is of type unicode, not string. Most of the time it is better to have text in unicode, for German umlauts and for sharing text results with other modules/programs. So you're good! – Michael P Aug 29 '15 at 11:56
  • How can I find the size of the maps array to control the index in this example? data["maps"][0]["id"] - here 0 is hard-coded. – Karthi Apr 26 '17 at 02:34
  • isn't there a resource leak because the handle to `data.json` is never closed? – Max Heiber Jan 16 '18 at 19:37
  • In python 3, my json file is an array [] of jsons, it's called valid json by online checkers, and with these commands it loaded perfectly. Perhaps the definitions have changed circa 2018? – Nikhil VJ Feb 19 '18 at 15:19
  • @nikhilvj json doesn't need to have `{}` at the root level. It can start with an array at the root level (`[]`) – Justin Peel Feb 22 '18 at 18:08
  • https://stackoverflow.com/a/27415238/3299397 need this piece for it to work. – Kyle Bridenstine Jul 13 '18 at 19:27
  • What exception will be thrown if the with call fails? Should this be wrapped in a try catch? – Kyle Bridenstine Aug 21 '18 at 19:00
  • @JustinPeel is there any chance that the order of the elements that are in the array (value of maps) will be changed. For example storing it into a data store (elastic search or any database) and then getting it back from there. – viveksinghggits Apr 22 '19 at 09:17
322

Your data.json should look like this:

{
 "maps":[
         {"id":"blabla","iscategorical":"0"},
         {"id":"blabla","iscategorical":"0"}
        ],
"masks":
         {"id":"valore"},
"om_points":"value",
"parameters":
         {"id":"valore"}
}

Your code should be:

import json
from pprint import pprint

with open('data.json') as data_file:
    data = json.load(data_file)
pprint(data)

Note that this only works in Python 2.6 and up, because it depends on the with statement. In Python 2.5, use from __future__ import with_statement; in Python <= 2.4, see Justin Peel's answer, on which this answer is based.

You can now also access single values like this:

data["maps"][0]["id"]  # will return 'blabla'
data["masks"]["id"]    # will return 'valore'
data["om_points"]      # will return 'value'
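If attribute-style access is preferred over key lookups, one possible approach (not part of the answer above, and only a sketch) is to pass an `object_hook` to `json.loads` that turns every decoded object into a `types.SimpleNamespace`. This assumes all keys are valid Python identifiers; the inline string is an illustrative stand-in for the file contents:

```python
import json
from types import SimpleNamespace

text = (
    '{"maps": [{"id": "blabla", "iscategorical": "0"}],'
    ' "masks": {"id": "valore"}, "om_points": "value"}'
)

# object_hook is called for every decoded JSON object (dict);
# here each one is converted into a SimpleNamespace.
data = json.loads(text, object_hook=lambda d: SimpleNamespace(**d))

print(data.om_points)    # value
print(data.masks.id)     # valore
print(data.maps[0].id)   # blabla
```

JSON arrays still decode to plain lists, so numeric indexing such as `data.maps[0]` works as before.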
Community
Bengt
  • Referring to 2.6 documentation (https://docs.python.org/2.6/library/io.html), opening a file in the "with" context will automatically close the file. – Steve S. Jun 16 '15 at 01:54
  • 1
    @SteveS. Yes, but not before the context is left. `pprint`ing in the `with`-context keeps the `data_file` open longer. – Bengt Jun 16 '15 at 17:45
  • Is there a way to access like data.om_points or data.masks.id? – Gayan Pathirage Mar 15 '17 at 10:16
  • This works except when I try to use a numbered index like `data["maps"][0]["id"]` I see error: `KeyError: 0` – Patrick Schaefer Apr 03 '17 at 19:25
  • 1
    @GayanPathirage you access it like `data["om_points"]` , `data["masks"]["id"]`. The idea is you can reach any level in a dictionary by specifying the 'key paths'. If you get a `KeyError` exception it means the key doesn't exist in the path. Look out for typos or check the structure of your dictionary. – Nuhman May 25 '18 at 04:55
5

Here is a modified data.json file:

{
    "maps": [
        {
            "id": "blabla",
            "iscategorical": "0"
        },
        {
            "id": "blabla",
            "iscategorical": "0"
        }
    ],
    "masks": [{
        "id": "valore"
    }],
    "om_points": "value",
    "parameters": [{
        "id": "valore"
    }]
}

You can load the data and print it to the console with the following lines:

import json
from pprint import pprint
with open('data.json') as data_file:
    data_item = json.load(data_file)
pprint(data_item)

Expected output for pprint(data_item):

{'maps': [{'id': 'blabla', 'iscategorical': '0'},
          {'id': 'blabla', 'iscategorical': '0'}],
 'masks': [{'id': 'valore'}],
 'om_points': 'value',
 'parameters': [{'id': 'valore'}]}

Expected output for print(data_item['parameters'][0]['id']):

valore
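Because "masks" and "parameters" are wrapped in [] in this version, each decodes to a list containing one dict, so an index is needed before the key. A short sketch, using an inline string in place of data.json:

```python
import json

text = '{"masks": [{"id": "valore"}], "parameters": [{"id": "valore"}]}'
data_item = json.loads(text)

# "masks" is a list holding one dict, so the index comes before the key.
print(data_item["masks"][0]["id"])  # valore

# Leaving out the index means indexing a list with a string,
# which raises TypeError rather than KeyError.
try:
    data_item["masks"]["id"]
except TypeError as err:
    print("TypeError:", err)
```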
np_6
Ramapati Maurya
  • If we would like add a column to count how many observations does "maps" have, how could we write this function? – Chenxi Jun 07 '18 at 17:24