
I have trouble understanding how json.loads passes the parsed JSON to the object_hook function.

Consider the following:

import json

j = '{"key": "Program1",\
    "value": {"codes": [], "name": "John Doe", "cities": ["New York", "Houston"]}}'

class Humans:
    def __init__(self, source, codes, name, cities):
        self.source = source
        self.codes = codes
        self.name = name
        self.cities = cities

def as_Humans(dct):
    print(dct)
    source = dct['key']
    value = dct['value']
    codes = value['codes']
    name = value['name']
    cities = value['cities']
    return(Humans(source, codes, name, cities))


print(json.loads(j))
human = json.loads(j, object_hook = as_Humans)        
print(human.name)

When I run this I get the following output and error:

{'key': 'Program1', 'value': {'name': 'John Doe', 'cities': ['New York', 'Houston'], 'codes': []}}
{'name': 'John Doe', 'cities': ['New York', 'Houston'], 'codes': []}
Traceback (most recent call last):
  File "stackoverflow.py", line 25, in <module>
    human = json.loads(j, object_hook = as_Humans)        
  File "/usr/lib/python3.4/json/__init__.py", line 331, in loads
    return cls(**kw).decode(s)
  File "/usr/lib/python3.4/json/decoder.py", line 343, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.4/json/decoder.py", line 359, in raw_decode
    obj, end = self.scan_once(s, idx)
  File "stackoverflow.py", line 16, in as_Humans
    source = dct['key']
KeyError: 'key'

Clearly this does not work the way I think it does: from the print(dct) in the hook, I can see that only a subset of the original json got passed, and since that part does not have the key "key", well, I get a key error.

If I modify this in order to only have {"codes": [], "name": "John Doe", "cities": ["New York", "Houston"]} in the json (and modify everything else accordingly), it works.

Interestingly enough, if the json is '{"value": {"codes": [], "name": "John Doe", "cities": ["New York", "Houston"]}}', it fails in the same way (key error on "value").

What am I missing? Is my JSON string invalid? Is it possible to do this without tweaking the json first?

zlr
1 Answer


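object_hook is called for every object (dict) nested in the JSON, from the innermost to the outermost, and whatever it returns replaces that dict in the result. So {"codes": [], "name": "John Doe", "cities": ["New York", "Houston"]} is passed to the hook first, and your hook raises the KeyError on it before it ever sees the outer object.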
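A minimal sketch to observe that order, using only the standard json module. The names `recording_hook` and `calls` are made up for this illustration; the hook just records the keys of each dict it receives and returns the dict unchanged:

```python
import json

j = ('{"key": "Program1", '
     '"value": {"codes": [], "name": "John Doe", "cities": ["New York", "Houston"]}}')

calls = []  # hypothetical helper list: keys of each dict, in the order the hook sees them

def recording_hook(dct):
    # Record the (sorted) keys of the decoded object, then return it unchanged.
    calls.append(sorted(dct))
    return dct

result = json.loads(j, object_hook=recording_hook)

# The nested object arrives first, the enclosing object last.
print(calls[0])  # ['cities', 'codes', 'name']
print(calls[1])  # ['key', 'value']
```

Since this hook returns every dict as-is, `result` is the same plain dict that json.loads would produce without a hook.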

Your example can be fixed with the following object_hook:

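def as_humans(dct):
    # Nested dicts reach the hook too; return them unchanged so the
    # enclosing object can still read them.
    if 'key' not in dct:
        print("IT'S A NOT HUMAN", dct)
        return dct
    # Only the outer object has 'key'; dct['value'] is the dict returned above.
    print("IT'S A HUMAN", dct)
    source = dct['key']
    value = dct['value']
    codes = value['codes']
    name = value['name']
    cities = value['cities']
    return Humans(source, codes, name, cities)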

and the output will be:

{'value': {'cities': ['New York', 'Houston'], 'name': 'John Doe', 'codes': []}, 'key': 'Program1'}
IT'S A NOT HUMAN {'cities': ['New York', 'Houston'], 'name': 'John Doe', 'codes': []}
IT'S A HUMAN {'value': {'cities': ['New York', 'Houston'], 'name': 'John Doe', 'codes': []}, 'key': 'Program1'}
John Doe
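If only the top-level object ever needs converting, one alternative is to skip object_hook and build the instance from the plain dict that json.loads returns. This is a rough sketch under that assumption, not a replacement for the hook when the objects are nested deeper; `humans_from_dict` is a name invented here, and the Humans class is repeated from the question so the snippet runs on its own:

```python
import json

class Humans:
    def __init__(self, source, codes, name, cities):
        self.source = source
        self.codes = codes
        self.name = name
        self.cities = cities

def humans_from_dict(dct):
    # Hypothetical helper: expects the full top-level dict with 'key' and 'value'.
    value = dct['value']
    return Humans(dct['key'], value['codes'], value['name'], value['cities'])

j = ('{"key": "Program1", '
     '"value": {"codes": [], "name": "John Doe", "cities": ["New York", "Houston"]}}')

# Decode to a plain dict first, then convert once at the top level.
human = humans_from_dict(json.loads(j))
print(human.name)  # John Doe
```

The conversion then runs exactly once, on the whole document, so there is no need to tell inner dicts apart from the outer one.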
OcuS
  • Stellar! I was missing the `return(dct)` when the key isn't found. There's a reason they call you "the druid"! – zlr Jul 08 '16 at 16:48
  • Related, but for yaml: http://stackoverflow.com/questions/19439765/is-there-a-way-to-construct-an-object-using-pyyaml-construct-mapping-after-all-n – zlr Aug 10 '16 at 22:46