I'm trying to grab specific values as I iterate through a list of dictionaries that contain nested dictionaries and lists.

This is roughly what my imported JSON data looks like (simplified). It's a list of dictionaries with nested dictionaries and nested lists.

# What a single dictionary looks like prettified

[{ 'a':'1',
'b':'2',
'c':'3',
'd':{ 'ab':'12',
      'cd':'34',
      'ef':'56'},
'e':['test', 'list'],
'f':'etc...'
}]

# What the list of dictionaries looks like

dict_list = [{ 'a':'1', 'b':'2', 'c':'3', 'd':{ 'ab':'12','cd':'34', 'ef':'56'}, 'e':['test', 'list'], 'f':'etc...'}, { 'a':'2', 'b':'3', 'c':'4', 'd':{ 'ab':'23','cd':'45', 'ef':'67'}, 'e':['test2', 'list2'], 'f':'etcx2...'},{},........,{}]

This is the code I originally had which only iterates through the list of dictionaries.

for dic in dict_list:
    for val in dic.values():
        if not isinstance(val, dict):
            print(val)
        else:
            for val2 in val.values():
                print(val2)

The print statements in my original code above were there to simply show me what was being pulled from the list of dictionaries. What I wanted to be able to do is declare which values I am looking to grab from the top level and second level dictionaries and lists.

Here is what I am looking for as output as an example.

The value of the first key for each top level dictionary in the list.

top_level_dict_key1 = ['1','2']

All the values for the level 2 dictionaries.

level2_dic = ['12', '34', '56', '23', '45', '67']

Or specific values. In this case, the value for the first key in each nested dictionary.

level2_dict = ['12', '23']

The first element of each nested list.

level2_list = ['test', 'test2']

Hopefully this is clear. I'll do my best to clarify if you need me to.

MixedBeans
  • What version of Python? Before 3.7, dictionaries aren't guaranteed to have any particular order. –  Mar 15 '19 at 17:17
  • (Btw, good questions should be able to stand on their own. If you can edit it to make sense without referencing your previous question, it makes it that much easier for anyone trying to help you.) –  Mar 15 '19 at 17:18
  • Currently Python 3.6 but I could run an environment using 3.7 for this part of my project. The rest of the project is going to be deep learning so 3.7 is probably not a good idea for that. – MixedBeans Mar 15 '19 at 18:08
  • @JETM What I posted was pretty much all there was to my other question. I went back and edited this one a little bit, but there isn't much in the way of new information. Is there anything I am missing that you need clarification on? I'd be glad to oblige. – MixedBeans Mar 15 '19 at 18:14
  • The references to your previous question are confusing. I can't see a clear question. Can you just remove all reference to having had this question before? – Blorgbeard Mar 15 '19 at 18:17
  • @Blorgbeard Edited – MixedBeans Mar 15 '19 at 19:22

1 Answer

In CPython 3.6, dictionaries happen to preserve insertion order as an implementation detail, but it is not good to rely on this behavior; the language only guarantees it from Python 3.7 onward. It's meaningless to ask about the "first element" of something unless it's ordered, so the first step is to read the JSON into an OrderedDict.

Then it's just a matter of careful bookkeeping, e.g.:

import json
from collections import OrderedDict

dict_list = '[{ "a":"1", "b":"2", "c":"3", "d":{ "ab":"12","cd":"34", "ef":"56"}, "e":["test", "list"], "f":"etc..."}, { "a":"2", "b":"3", "c":"4", "d":{ "ab":"23","cd":"45", "ef":"67"}, "e":["test2", "list2"], "f":"etcx2..."}]'

dict_list = json.loads(dict_list, object_pairs_hook=OrderedDict)
top_level_dict_key1 = []
level2_dic = []
level2_dict = []
level2_list = []
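for dictionary in dict_list:
    # First value of each top-level dictionary (relies on preserved key order).
    top_level_dict_key1.append(list(dictionary.values())[0])
    for value in dictionary.values():
        # OrderedDict subclasses dict, so this check also matches plain dicts.
        if isinstance(value, dict):
            level2_dic.extend(value.values())  # every nested value
            level2_dict.append(list(value.values())[0])  # first nested value only
        elif isinstance(value, list):
            level2_list.append(value[0])  # first element of each nested list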

print(top_level_dict_key1)
print(level2_dic)
print(level2_dict)
print(level2_list)

Output:

['1', '2']
['12', '34', '56', '23', '45', '67']
['12', '23']
['test', 'test2']

(This is probably not the most idiomatic Python 3 code. I'll edit in something better when I'm less tired.)
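For what it's worth, here is one possible comprehension-based sketch of the same bookkeeping. It is not the only way to do it. The helper name `first_value` and the variable names `json_text`, `records`, `nested_dicts` and `nested_lists` are made up for illustration, and it assumes every dictionary and nested list is non-empty.

```python
import json
from collections import OrderedDict

json_text = (
    '[{"a": "1", "b": "2", "c": "3", '
    '"d": {"ab": "12", "cd": "34", "ef": "56"}, '
    '"e": ["test", "list"], "f": "etc..."}, '
    '{"a": "2", "b": "3", "c": "4", '
    '"d": {"ab": "23", "cd": "45", "ef": "67"}, '
    '"e": ["test2", "list2"], "f": "etcx2..."}]'
)

# Keep key order explicit so "first key" is well defined on Python 3.6 too.
records = json.loads(json_text, object_pairs_hook=OrderedDict)


def first_value(mapping):
    """Hypothetical helper: first value of a non-empty mapping, no list copy."""
    return next(iter(mapping.values()))


# Collect the nested containers once, then pull values out of them.
nested_dicts = [v for rec in records for v in rec.values() if isinstance(v, dict)]
nested_lists = [v for rec in records for v in rec.values() if isinstance(v, list)]

top_level_dict_key1 = [first_value(rec) for rec in records]
level2_dic = [x for d in nested_dicts for x in d.values()]
level2_dict = [first_value(d) for d in nested_dicts]
level2_list = [lst[0] for lst in nested_lists]

print(top_level_dict_key1)
print(level2_dic)
print(level2_dict)
print(level2_list)
```

Since `OrderedDict` is a subclass of `dict`, the `isinstance(v, dict)` check matches whether or not the data was loaded with `object_pairs_hook`.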

  • While my sample data appears to be ordered the actual data is not and for this project it doesn't necessarily matter since there really isn't an order to the data but I see what you are saying. – MixedBeans Mar 15 '19 at 22:30
  • Just gave the code a try but I'm getting a `TypeError: 'dict_values' object does not support indexing` error. Is this a Python 3.6 issue? – MixedBeans Mar 16 '19 at 00:20
  • @MixedBeans Ah, my bad. I tested in 2.7. You'll have to cast the `.values()` to a list before you can index into them. Or see [this](https://stackoverflow.com/a/3097896/2201041) answer to avoid creating a list you don't need. –  Mar 16 '19 at 01:39
  • Thanks. I gave this code a try and realized I forgot to use OrderedDict. I tried to implement it, but I would get an error: `TypeError: 'object_pairs_hook' is an invalid keyword argument for this function`. Here is how I was loading the json prior: `with open('test.json', 'r') as f: json_text = f.read() dict_list = json.loads(json_text)`. I tried dropping the with open and went with `dict_list = json.load(open('test.json'))`, and that seems to load fine, but when I add `object_pairs_hook=OrderedDict` as an argument to json.load, as I've seen in other examples, I get the TypeError. – MixedBeans Mar 16 '19 at 14:18
  • @MixedBeans Er... double-check that you're passing it to `json.load` instead of `open`/spelling it correctly? I just double-checked the [docs](https://docs.python.org/3/library/json.html#json.load), and it's definitely in there. –  Mar 16 '19 at 14:58
  • If I use json.loads and pass in the variable, or use json.load and directly pass in the json file, they both work. Where things fall down is when I try to make it an ordered dictionary using the object_pairs_hook=OrderedDict argument. Tbh, reading the definition of what object_pairs_hook does, it doesn't sound like it would work, but then again I'm still really new to Python so I'm probably missing some concept. – MixedBeans Mar 16 '19 at 16:13
  • Been in and out of the office all day. I just ran the code you have above and it works. No errors, but the level2_dic and level2_dict lists are both empty. Is this because I am not getting the data into an OrderedDict correctly? – MixedBeans Mar 16 '19 at 18:53
  • @MixedBeans Probably. I just edited my example to show reading your data into an `OrderedDict` and doing the same thing. –  Mar 16 '19 at 20:37
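
To tie the comment thread together, here is a rough sketch of loading from a file with `object_pairs_hook` passed to `json.load` (not to `open`), then pulling values out by key name, which sidesteps the ordering question entirely when order doesn't matter. The temporary file only stands in for `test.json`; the trimmed-down `sample` data and the names `records`, `by_key_a`, `by_key_ab` and `first_of_e` are assumptions for illustration.

```python
import json
import os
import tempfile
from collections import OrderedDict

# Trimmed-down sample data, standing in for the contents of test.json.
sample = [
    {"a": "1", "d": {"ab": "12", "cd": "34"}, "e": ["test", "list"]},
    {"a": "2", "d": {"ab": "23", "cd": "45"}, "e": ["test2", "list2"]},
]

# Write the sample to a temporary file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump(sample, tmp)
    path = tmp.name

try:
    with open(path, "r") as f:
        # object_pairs_hook is an argument of json.load, not of open().
        records = json.load(f, object_pairs_hook=OrderedDict)
finally:
    os.remove(path)

# Access by key name: no dependence on key order at all.
by_key_a = [rec["a"] for rec in records]          # ['1', '2']
by_key_ab = [rec["d"]["ab"] for rec in records]   # ['12', '23']
first_of_e = [rec["e"][0] for rec in records]     # ['test', 'test2']

print(by_key_a, by_key_ab, first_of_e)
```

If the key names are known up front, this keyed lookup is usually easier to read than positional indexing into `.values()`.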