
I am trying to accomplish the same goal as @Jannik in this question: Python: Getting all values of a specific key from json

I'm attempting to use the same solution, but I don't know what to put in the result[''] because my results aren't labeled with a parent title like 'ABC'. How can I pull all the values for the same key listed throughout my JSON data?

Here's my JSON 'list':

result = [
   {
      "created_at":"Wed Oct 13 19:01:18 +0000 2021",
      "id":1448363231366590471,
      "id_str":"1448363231366590471",
      "text":"The editor of @SpaceRef and @NASAWatch - @KeithCowing - will be live on @DeutscheWelle to talk about the… https://example.com",
      "truncated":true,
      "entities":{
         "hashtags":[],
         "symbols":[],
         "user_mentions":[
            {
               "screen_name":"SpaceRef",
               "name":"SpaceRef",
               "id":8623332,
               "id_str":"8623332",
               "indices":[]
            }
         ]
      },
      "retweet_count": 1,
      "favorite_count": 3
   }
]

I am attempting to pull all instances of the keys 'retweet_count' and 'favorite_count' and their corresponding values.

The following results in the TypeError: list indices must be integers or slices, not str:

def get_val(key):
  return [entry[key] for entry in result[''].values()]
  
print(get_val('retweet_count'))

Here's how I've accomplished getting them one by one, but it's not practical for the amount I need:

result = requests.get(url,headers={'Content-Type':'application/json','Authorization': 'Bearer {}'.format(bearer_token)}).json()

retweets1 = result[0]['retweet_count']
likes1 = result[0]['favorite_count']
retweets2 = result[1]['retweet_count']
likes2 = result[1]['favorite_count']
retweets3 = result[2]['retweet_count']
likes3 = result[2]['favorite_count']
retweets4 = result[3]['retweet_count']
likes4 = result[3]['favorite_count']

etc...

print(retweets1, retweets2, etc...)
VRapport
  • Please make the json readable NOT on a single line, and just show a couple of entries. – DisappointedByUnaccountableMod Oct 13 '21 at 21:03
  • I meant a couple of entries in the list. Please make the JSON a _valid_ JSON string, and show an example of `retweet_count` and `favorite_count`, because how is anyone supposed to help you find the key/value you want when the key isn’t shown in your question? You can remove the keys/values that don’t matter as long as you keep the nesting of dictionaries/lists the same as your real data. If the specific data is confidential, replace it with representative short nonsense strings/numbers. Basically, make the content in your question minimal and representative. – DisappointedByUnaccountableMod Oct 13 '21 at 21:17

2 Answers

1

As you mentioned, unlike the data in the Python: Getting all values of a specific key from json solution, your result data does not have key/value pairs at its top level. It is a list of JSON objects, so you must iterate over the list and look up your target keys in each item. As a first step, let's simplify your JSON (update your question if the simplified version is incorrect).

So let's say your JSON data has this structure:

result = [
    {
        # Other keys ...
        "retweet_count": 1,
        "favorite_count": 2
    },

    {
        # Other Keys ...
        "retweet_count": 3,
        "favorite_count": 4
    },

    {
        # Other Keys ...
        "retweet_count": 5,
        "favorite_count": 6
    }
    # Other items ...
]

Now we iterate over this list with a loop (a for loop is the better fit here):

def get_values(get_keys):
    target_values = []
    for each_item in result:
        for each_key in get_keys:
            if each_key in each_item:
                target_values.append(each_item[each_key])
    return target_values

If you prefer a list comprehension (a one-line loop), use this version:

def get_values(get_keys):
    target_values = [each_item[each_key] for each_item in result for each_key in get_keys if each_key in each_item]
    return target_values

Then you just need to call this function and pass your target keys as a list to it:

if __name__ == '__main__':
    output = get_values(["retweet_count", "favorite_count"])
    print(output)

It will return your target values as a single flat list like this:

[1, 2, 3, 4, 5, 6]

But if you need the values of each item separately you can use this version:

def get_values(get_keys):
    total_values = []
    for each_item in result:
        each_item_values = []
        for each_key in get_keys:
            each_item_values.append(each_item.get(each_key, None))
        total_values.append(each_item_values)
    return total_values

It will return a list of lists, with one inner list of values per item:

[[1, 2], [3, 4], [5, 6]]

If a target key does not exist in an item, None is returned in its place. For example, passing an extra key:

if __name__ == '__main__':
    output = get_values(["retweet_count", "hello, how are you?", "favorite_count"])
    print(output)

Output:

[[1, None, 2], [3, None, 4], [5, None, 6]]
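
If you would rather have one list per key (all retweet counts together, all favorite counts together), here is a minimal sketch. It assumes the same simplified result list as above; the name get_columns is made up for this illustration:

```python
# Simplified sample data with the same shape as the example above
# (an assumption for illustration, not real API output).
result = [
    {"retweet_count": 1, "favorite_count": 2},
    {"retweet_count": 3, "favorite_count": 4},
    {"retweet_count": 5, "favorite_count": 6},
]


def get_columns(items, keys):
    """Return a dict mapping each key to the list of its values across items.

    A key missing from an item is recorded as None, so every list has
    exactly one entry per item.
    """
    return {key: [item.get(key) for item in items] for key in keys}


columns = get_columns(result, ["retweet_count", "favorite_count"])
print(columns["retweet_count"])   # [1, 3, 5]
print(columns["favorite_count"])  # [2, 4, 6]
```

Passing the list in as an argument, instead of reading a global result, also makes the function easier to reuse on another response.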

Depending on your JSON structure and its size, you can change the search algorithm to make it faster, shorter, or more memory-efficient.
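
One example of such a change: if the keys can also appear deeper inside nested dictionaries or lists, a recursive generator can collect them at any depth. This is only a sketch; find_values and the sample data are made up for illustration and are not taken from your real response:

```python
def find_values(data, key):
    """Yield every value stored under `key`, at any nesting depth."""
    if isinstance(data, dict):
        for k, v in data.items():
            if k == key:
                yield v
            # Keep descending: the value itself may contain the key again.
            yield from find_values(v, key)
    elif isinstance(data, list):
        for element in data:
            yield from find_values(element, key)
    # Any other type (str, int, None, ...) has nothing to search.


# Hypothetical sample: one item carries a nested object with its own count.
sample = [
    {
        "retweet_count": 1,
        "entities": {"user_mentions": [{"screen_name": "SpaceRef"}]},
        "retweeted_status": {"retweet_count": 7},
    },
    {"retweet_count": 3},
]

print(list(find_values(sample, "retweet_count")))  # [1, 7, 3]
print(list(find_values(sample, "screen_name")))    # ['SpaceRef']
```

Because it is a generator, values are produced one at a time, so you can loop over them without building the whole list first.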

Note: If your JSON data is very complex, with many nested layers, or is very large, consider a document database such as MongoDB, which has a Python driver (PyMongo). With it you can query values or complex search patterns in a few lines of code, usually faster and with less memory than hand-written loops.

Check this out: https://www.w3schools.com/python/python_mongodb_getstarted.asp

DRPK
  • Thanks @DRPK - All of these solutions work great. Thanks for bringing my attention to MongoDB - Thankfully, I don't need it now, but it's nice to know it's there in case I need it in the future! – VRapport Oct 14 '21 at 11:56
0

You might try https://pypi.org/project/wildpath/:

To do the above:

path = WildPath("*.retweet_count|favorite_count")

result = path.get_in(json_blob)

Lars