
I need to extract a single float from each of hundreds or thousands of .json files (a format I am not familiar with, and whose creation I do not control) and use the values later in my code. Relevant excerpt with shortened names:

...
    "a": {
        "b": {
            "variable_name": {
                "known_key": 133.2982,
                ...

There are multiple additional keys at the same level as "a", "b", and "known_key". I have no way of knowing what "variable_name" will be before accessing the file, and I do not need to track it in any way. I do know that it will be the only key at that level of the dictionary and that it is almost guaranteed not to be unique among the different .json files.

Using this answer, I was able to determine that I could generically access the "variable_name" key by repeating the entire dictionary structure to that point and using .keys()[0], but it feels like there should be a better way of doing this.

import json

with open("json_file_X.json", "r") as j_in:
    data = json.load(j_in)
    # "b" holds exactly one key whose name is unknown in advance
    needed = data["a"]["b"][list(data["a"]["b"].keys())[0]]["known_key"]

#do downstream stuff with needed float value after closing .json file

I know that I could substitute the following two-line for loop for the 'needed' line above, but this seems wrong because someone else looking at this code would think I'm iterating over all the keys and only keeping the last value.

for var_key in data["a"]["b"]:
    needed = data["a"]["b"][var_key]["known_key"]

So I am specifically interested in a way to simplify [list(data["a"]["b"].keys())[0]], given that I know there is only one key at that level. Alternatively, am I going about the .json file structure completely wrong, given that I only need one value out of the entire file?

ded

1 Answer


list(data["a"]["b"].keys())[0] can be "simplified" to list(data['a']['b'])[0], but it's not much of a simplification.
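If the goal is to avoid naming the key at all, one option is to skip the key and take the single value of the inner dictionary directly. The sketch below shows two ways to do that, assuming the structure from the question. The inline sample data and the helper name only_value are made up for illustration. Single-element unpacking has the side benefit of raising ValueError if the dictionary ever holds zero or several keys, whereas next(iter(...)) silently takes the first one.

```python
import json

# Hypothetical sample mirroring the structure shown in the question.
sample = json.loads(
    '{"a": {"b": {"variable_name": {"known_key": 133.2982, "other_key": 1}}}}'
)


def only_value(mapping):
    """Return the single value of a one-key dict (hypothetical helper).

    The one-element unpacking raises ValueError when the dict
    does not contain exactly one entry.
    """
    (value,) = mapping.values()
    return value


# Strict version: fails loudly if "b" does not hold exactly one key.
needed = only_value(sample["a"]["b"])["known_key"]

# Lenient version: takes the first value without checking for others.
first = next(iter(sample["a"]["b"].values()))["known_key"]
```

Either form reads as "take the one entry" rather than "loop over the entries", which addresses the concern about the for loop looking like it keeps only the last value.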

I'm guessing the reason these JSON files are formatted this way is that variable_name is something that is either unique or changing a lot, like a username or timestamp, and you want to know its value. If you are able to change the JSON format at all, here are two formats which would still give you access to variable_name while making it easier to get your float value:

1)

"a": {
    "b": {
        "NAME": "variable_name",
        "known_key": 133.2982,
        ...
        "another_key": 4545.234
         }
     }

You can get variable_name with data['a']['b']['NAME'] and your float value with data['a']['b']['known_key'], without needing to figure out what variable_name is.

2)

"META": {
    "NAME": "variable_name"
    },
"DATA": {
    "a": {
        "b": {
            "known_key": 133.2982,
            ...
            "another_key": 4545.234
             }
         }
}

You can get variable_name with data['META']['NAME'] and your float value with data['DATA']['a']['b']['known_key'], again without needing to figure out what variable_name is.
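As a rough sketch of how lookups would work under the two proposed layouts, the snippet below parses a small inline sample of each. The sample values are made up for illustration, and the elided keys are simply left out. Note that in the second layout as drawn, "a" sits under "DATA", so the float lookup goes through data['DATA'].

```python
import json

# Hypothetical sample for layout 1: the name is stored next to the data.
layout_1 = json.loads("""
{"a": {"b": {"NAME": "variable_name",
             "known_key": 133.2982,
             "another_key": 4545.234}}}
""")

# Hypothetical sample for layout 2: metadata and data are kept apart.
layout_2 = json.loads("""
{"META": {"NAME": "variable_name"},
 "DATA": {"a": {"b": {"known_key": 133.2982,
                      "another_key": 4545.234}}}}
""")

# Layout 1: both lookups use fixed keys under a -> b.
name_1 = layout_1["a"]["b"]["NAME"]
value_1 = layout_1["a"]["b"]["known_key"]

# Layout 2: the name comes from META, the float from DATA -> a -> b.
name_2 = layout_2["META"]["NAME"]
value_2 = layout_2["DATA"]["a"]["b"]["known_key"]
```

In both cases every key in the access path is known ahead of time, which is the point of the restructuring.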

If you can't change the format, can you change the JSON filenames? Because then you could have each filename be json_file.variable_name.json, so your variable_name is encoded in the filename. Then access data['a']['b']['variable_name'] like:

import json

for fname in ['json_file.X.json', 'json_file.Y.json', ...]:
    with open(fname, "r") as j_in:
        data = json.load(j_in)
        # the middle part of the filename is the variable key
        var_name = fname.split('.')[1]
        needed = data["a"]["b"][var_name]["known_key"]
        print(fname, var_name, needed)
SkippyElvis
Unfortunately, the JSON files won't change, and having the variable_name key is correct, since the program CAN have multiple keys at that level, but I know from the inputs I give the program that there will be only one key. Additionally, knowing which inputs I give does not tell me what variable_name will be, so I can't use that info either. Thanks for the catch on not needing `.keys()`; it's a holdover from Python 2. – ded Aug 15 '19 at 14:39