I am fairly new to Python and I am trying to search for a specific keyword in a JSON file. I have been reading up on how dictionaries and lists work in Python and I came across this:

complex_list = [["a",["b",["c","x"]]],42]
complex_list[0][1]
output: ['b', ['c', 'x']]
complex_list[0][1][1][0]
output: 'c'

As I understand, in complex_list[0][1], the [0] is the entire bracket [, ], and [1] accesses the second part of the bracket: [, [this one] ].

Now, this list, complex_list = [["a",["b",["c","x"]]],42], has 2 elements, correct? a, b, c, and x belong to the first element and 42 is the second element. I don't know how to interpret complex_list[0][1][1][0] so that it accesses 'c'.

Could someone break it down please? I ask this because I think this is what I need to use to solve the problem I explain below.

This is a small sample from the file I am working with at the moment:

{ (white)

  "results": [
    { (black)
      "Fruit": "Apple",
      "Nested fruit": [
        "Orange"
      ],
      "Title1": "Some text",
      "Contents": { (yellow)
        "Name 1": [
          "John Smith"
        ],
        "Name 2": [
          "Tyler"
        ],
        "Name 3": [
          "Bob",
          "Rob"
        ],
        "Name 4": [
          "Linda"
        ],
        "Name 5": [
          "Mark",
          "Matt"
        ],
        "Some boolean": [
          true
        ]
      }, (yellow)

      "More stuff": "More random text",
      "Confusing": [
        { (red)
          "Some info": "456",
          "Info I want": "849456"
        } (red)
      ],
      "Not important": [
        { (blue)
          "random text": "bla",
          "random text2": "bla bla"
        } (blue)
      ],
      "Not important 2": "000",
      "Not important3": [
        "whatever",
        "whatever"
      ],
      "Not important 4": "16",
      "Not important 5": "0058"
    } (black)
  ]
} (white)

I have put colors in parentheses next to their corresponding curly braces so that they are easy to distinguish. Following some examples online, I found:

import json

with open('searchingKeywords.json') as f:
    data = json.load(f)
print(data.keys())

for k in data:
    for v in data[k]:
        if 'More stuff' in v:
            print("yes")

which prints:

dict_keys(['results'])
yes

There is only one key, but what about "Contents"? Isn't that another key within "results"? I am confused. What I am interested in is "Info I want" inside "Confusing". How do I search through so many nested levels to check whether the key "Info I want" is present? Initially, after parsing the JSON file into a Python object, I tried reading it line by line to see whether "Info I want" appears in each line, but I kept getting errors. Additionally, the file I am working with is huge, and "Info I want" may be nested differently.

LtWorf
Kinozato
  • Does this answer your question? [recursive iteration through nested json for specific key in python](https://stackoverflow.com/questions/21028979/recursive-iteration-through-nested-json-for-specific-key-in-python) – Maurice Meyer Mar 14 '20 at 17:43
  • complex_list is a list with two elements. Remember that Python starts counting at 0, so the first element (element 0) in the list is ["a",["b",["c","x"]]] and the second element (element 1) is 42. Now, the first element is also a list with two elements: "a" and ["b",["c","x"]], with the second element being a list with two elements: "b" and ["c","x"]. Finally, that innermost list ["c","x"] has two elements: "c" and "x". So, we have complex_list[0][0]: "a", complex_list[0][1][0]: "b", complex_list[0][1][1][0]: "c", complex_list[0][1][1][1]: "x" and complex_list[1]: 42. – matnor Mar 14 '20 at 18:05
  • @Maurice Meyer It did not help but thank you. – Kinozato Mar 14 '20 at 18:10
  • Thanks @matnor. I get it now. – Kinozato Mar 14 '20 at 18:14
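matnor's breakdown can be checked by peeling off one index at a time. The following is only a small sketch; the names step1 to step4 are made up for illustration and are not part of the original question.

```python
complex_list = [["a", ["b", ["c", "x"]]], 42]

# Each [n] is applied to the result of everything to its left.
step1 = complex_list[0]  # first element of the outer list: ['a', ['b', ['c', 'x']]]
step2 = step1[1]         # second element of that list:     ['b', ['c', 'x']]
step3 = step2[1]         # second element again:            ['c', 'x']
step4 = step3[0]         # first element of the last list:  'c'

print(step4)
# Chaining the indexes in one expression gives the same value.
print(step4 == complex_list[0][1][1][0])
```

Reading complex_list[0][1][1][0] from left to right this way shows that every extra pair of brackets goes one level deeper into the nesting.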

1 Answer

As mentioned in the comments, the non-accepted answer in the linked question works perfectly well for your case:

data = {
  "results": [
    {
      "Fruit": "Apple",
      "Nested fruit": [
        "Orange"
      ],
      "Title1": "Some text",
      "Contents": {
        "Name 1": [ 
          "John Smith"
        ],
        "Name 2": [
          "Tyler"
        ],
        "Name 3": [
          "Bob",
          "Rob"
        ],
        "Name 4": [
          "Linda"
        ],
        "Name 5": [
          "Mark",
          "Matt"
        ],
        "Some boolean": [
          True
        ]
      },
      "More stuff": "More random text",
      "Confusing": [
        {
          "Some info": "456",
          "Info I want": "849456"
        }
      ],
      "Not important": [
        {
          "random text": "bla",
          "random text2": "bla bla"
        }
      ],
      "Not important 2": "000",
      "Not important3": [
        "whatever",
        "whatever"
      ],
      "Not important 4": "16",
      "Not important 5": "0058"
    }
  ]
}


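def item_generator(json_input, lookup_key):
    """Recursively yield every value stored under lookup_key."""
    if isinstance(json_input, dict):
        for k, v in json_input.items():
            if k == lookup_key:
                # Found the key at this level: hand back its value.
                yield v
            else:
                # Otherwise keep searching inside the value.
                yield from item_generator(v, lookup_key)
    elif isinstance(json_input, list):
        # Lists have no keys, so search each item in turn.
        for item in json_input:
            yield from item_generator(item, lookup_key)
    # Strings, numbers, booleans and None cannot contain keys, so nothing is yielded.


res = item_generator(data, 'More stuff')
print(list(res))

res = item_generator(data, 'Info I want')
print(list(res))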

Output:

['More random text']
['849456']
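
On the question of why data.keys() shows only 'results': keys() lists the keys of one dictionary only, not of the dictionaries nested inside it. The sketch below uses a trimmed-down copy of the sample (an assumption made to keep it short; only three of the original keys are kept) and the illustrative names raw, record and wanted to walk down to the value by hand.

```python
import json

# Trimmed-down copy of the sample from the question (assumption: most keys omitted).
raw = """
{
  "results": [
    {
      "Fruit": "Apple",
      "Contents": {"Name 1": ["John Smith"]},
      "Confusing": [{"Some info": "456", "Info I want": "849456"}]
    }
  ]
}
"""
data = json.loads(raw)

top_keys = list(data.keys())       # only the outermost dict: ['results']
record = data["results"][0]        # "results" is a list; take its first dict
record_keys = list(record.keys())  # "Contents" and "Confusing" live at this level

# `in` on a dict checks that dict's own keys, not anything nested deeper.
found_at_record_level = "Info I want" in record

# "Confusing" is a list holding one dict, hence the extra [0].
wanted = record["Confusing"][0]["Info I want"]

print(top_keys, record_keys, found_at_record_level, wanted)
```

This is also why the loop in the question prints "yes" for 'More stuff' (a key of the dict inside "results") but would print nothing for "Info I want", which sits two levels further down. Hard-coded indexing like this only works when the nesting is known in advance, which is where the recursive generator is more useful.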
Maurice Meyer
  • Mmm it did not work for me because I tried to understand that code and modified it based on how I understood it. I will try it again and then I’ll post an update, thanks! – Kinozato Mar 15 '20 at 18:37
  • Thank you @Maurice Meyer. I never copy and paste code because I always try to understand what is happening first. I was reading up all of the answers in the link you provided and tried to do my own code. I am going to analyze this code now and see how it works :D Thanks again! – Kinozato Mar 15 '20 at 18:46
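
Since the question mentions that "Info I want" may be nested differently across the file, it can help to see where each match was found as well as its value. The following is a hedged variation on item_generator, not code from the linked answer: the function name find_key_paths, its path parameter, and the small sample dict are all hypothetical.

```python
def find_key_paths(node, lookup_key, path=()):
    """Yield (path, value) pairs for every occurrence of lookup_key.

    `path` is a tuple of dict keys and list indexes leading to the match.
    """
    if isinstance(node, dict):
        for key, value in node.items():
            if key == lookup_key:
                yield path + (key,), value
            else:
                yield from find_key_paths(value, lookup_key, path + (key,))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_key_paths(item, lookup_key, path + (index,))


# Hypothetical sample shaped like the data in the question.
sample = {
    "results": [
        {
            "Fruit": "Apple",
            "Confusing": [{"Some info": "456", "Info I want": "849456"}],
        }
    ]
}

matches = list(find_key_paths(sample, "Info I want"))
print(matches)
# [(('results', 0, 'Confusing', 0, 'Info I want'), '849456')]
```

Each path can be read as the chain of indexes needed to reach the value directly, here sample['results'][0]['Confusing'][0]['Info I want']. Like the original generator, this sketch does not search inside the value of a key that already matched.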