
I have a script that requires certain keys from a JSON file. If any keys are missing, or if the JSON file is corrupted (a missing closing } or similar), the script breaks. While writing a script that opens the JSON file and attempts to fix these errors, I learned that .setdefault() does not set nested keys: it only checks the top-level key, so if "dict" already exists, its missing sub-keys are never added. I do not want to write nested for loops or some crazy function that checks every key manually, so after testing everything I could find on SO and still failing, I have come here. I have made a minimal reproducible example showing what should happen and what does happen. How can I write a function that adds all the required keys to large dictionaries that are nested multiple levels deep?

# NOTE: to run this, make a test.json file
import json

#test.json: {"dict":{"Key3": "Value1"}}
# What should be the output of this script: {"dict":{"Key3": "Value1", "Key1": "Value1", "Key2": "Value2"}}

default_dict = {
    "dict":{
        "Key1": "Value1",
        "Key2": "Value2"
    }
}

# I want to make sure that all the keys in default_dict are present in test.json
# Any extra keys in test.json should be left alone

with open("test.json", "r+") as f:
    # First check if the file is a broken json file
    try:
        content: dict = json.load(f)
    except json.decoder.JSONDecodeError:
        # If it is, delete the file content and replace it with the default_dict
        # I would prefer a way to just fix the issue but I have no idea how that would be accomplished
        f.seek(0)
        f.truncate()
        f.write(json.dumps(default_dict, indent=4))
        content: dict = default_dict
    #Check that all keys are present
    for key in default_dict:
        content.setdefault(key, default_dict[key])
    # json dump
    #Delete the file content
    f.seek(0)
    f.truncate()
    json.dump(content, f, indent=4)


#Load the json file
with open("test.json", "r") as f:
    content: dict = json.load(f)
    print(content)

#Output (The exact same as before!):
"""
{'dict': {'Key3': 'Value1'}}
"""
TRCK
  • You may get some ideas here: https://stackoverflow.com/questions/635483/what-is-the-best-way-to-implement-nested-dictionaries/19829714#19829714 – Barmar Apr 26 '23 at 15:06
  • @Barmar, while I understand what that post is saying, the JSON is bound, and even forced, to have nested dicts because of one of the APIs that consume that data. It also needs to be a human-readable file (normal JSON is, IMO) as it will likely be messed with by users. – TRCK Apr 26 '23 at 15:10
  • I think you'll need nested loops. – Barmar Apr 26 '23 at 15:13
  • What would be the most efficient way to do it then and how will it work if you don't know how deeply nested the JSON is? – TRCK Apr 26 '23 at 15:14
  • If you don't know the nesting level, use a recursive function. – Barmar Apr 26 '23 at 15:21
  • This might help with broken JSON: https://stackoverflow.com/a/18515887/17053202 – TRCK Apr 26 '23 at 15:21
  • A better approach may be to do a recursive merge of the JSON into the default. – Barmar Apr 26 '23 at 15:22
  • See https://stackoverflow.com/questions/7204805/how-to-merge-dictionaries-of-dictionaries – Barmar Apr 26 '23 at 15:23
  • Is the JSON really broken, or just missing some required keys? – Barmar Apr 26 '23 at 15:24
  • Sometimes it's really broken, sometimes it's missing keys. And with the merge, if the same key occurs twice, which is used? – TRCK Apr 26 '23 at 15:26
  • A key can't appear twice in a dictionary. `json.load()` will ignore one of the duplicates. – Barmar Apr 26 '23 at 15:29
  • I mean in the two dicts being merged, if they both have the same key – TRCK Apr 26 '23 at 15:40
  • The merge function should replace any existing keys with the value being merged. – Barmar Apr 26 '23 at 15:41
  • Hmmmmm, I don't want the necessary keys to override new values so I'll need to add an if to check if a value exists and keep the value if it does. It should only add the ones that don't exist. – TRCK Apr 26 '23 at 15:42
  • Why don't you want them to override? I thought these were just defaults in case the key is missing. If it's not missing in the JSON, you should use the JSON value. – Barmar Apr 26 '23 at 15:43
  • In case it's missing. If they've been modified by the user and they have changed a value I need to keep the value set by the user. – TRCK Apr 26 '23 at 15:44
  • Your code doesn't have anything about the user. There's just `default_dict` and `test.json`. The JSON file takes precedence. – Barmar Apr 26 '23 at 15:45
  • If there's a third source, you can do two merges. First override the defaults with the JSON file, then override that with the user-supplied values. – Barmar Apr 26 '23 at 15:46
  • Ok I think we're almost on the same page. The `test.json` can be modified by users of the application and therefore those keys take precedence but if the keys are missing they will be taken from `default_dict`. Does that make sense? – TRCK Apr 26 '23 at 15:48
  • Yes, that's exactly what I'm saying. You start with `default_dict`, then replace its elements with the corresponding elements from `test.json`. So the JSON takes precedence. – Barmar Apr 26 '23 at 15:49
  • And, this is less of actual coding and more preference, if the `test.json` is broken, how would you handle it to try and salvage as much data from the `test.json` as possible? – TRCK Apr 26 '23 at 15:51
  • There's no excuse for broken JSON, and no reliable way to recover from it. You linked to another question, I guess you could try something from that, but I recommend fixing the source. – Barmar Apr 26 '23 at 15:53
  • If you add all of this into an answer I will mark it. Thank you so much for your help! – TRCK Apr 26 '23 at 15:57
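For reference, the precedence rule settled on in the comments above (keep whatever is already in test.json, add only what is missing) could be sketched as a recursive version of setdefault. This is a minimal sketch and not code from the thread; fill_defaults is a hypothetical name, and it assumes the loaded JSON is a dict at the top level.

```python
import copy


def fill_defaults(target: dict, defaults: dict) -> dict:
    """Recursively add keys from defaults that are missing in target (in place)."""
    for key, default_value in defaults.items():
        if key not in target:
            # Copy so later edits to target never touch the defaults.
            target[key] = copy.deepcopy(default_value)
        elif isinstance(target[key], dict) and isinstance(default_value, dict):
            # Both sides are dicts: descend one level and repeat.
            fill_defaults(target[key], default_value)
        # Otherwise the existing (user-set) value is kept as-is.
    return target


default_dict = {"dict": {"Key1": "Value1", "Key2": "Value2"}}
content = {"dict": {"Key3": "Value1"}}  # what json.load() returns for test.json

fill_defaults(content, default_dict)
print(content)
# {'dict': {'Key3': 'Value1', 'Key1': 'Value1', 'Key2': 'Value2'}}
```

Because the recursion follows the shape of the defaults, it works for any nesting depth without hand-written nested loops.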

1 Answer

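Do the merge the other way around: start with default_dict as the base value, then merge the JSON into it, so that values from the file override the defaults and any missing keys keep their default values.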

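import copy

# Copy the defaults so merging does not mutate default_dict itself
result = copy.deepcopy(default_dict)

with open("test.json") as f:
    try:
        new_content = json.load(f)
        # Values from the file override the defaults
        merge_recursive(result, new_content)
    except json.JSONDecodeError:
        pass  # Broken JSON: keep the defaults in result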

You can find ways to implement merge_recursive() at "How to merge dictionaries of dictionaries?" (https://stackoverflow.com/questions/7204805/how-to-merge-dictionaries-of-dictionaries). Some of the answers there will not override an existing key; either modify them to allow this, or pick a different answer.
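As one possible shape for merge_recursive(), here is a minimal sketch in which the merged-in value always wins. It is an assumption about how such a helper could look, not the implementation from the linked question; the parameter names base and override are made up for illustration.

```python
import copy
import json


def merge_recursive(base: dict, override: dict) -> dict:
    """Merge override into base in place; values from override win."""
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            # Both sides hold a dict for this key: merge one level down.
            merge_recursive(base[key], value)
        else:
            # New key, or a non-dict value: take the override as-is.
            base[key] = value
    return base


default_dict = {"dict": {"Key1": "Value1", "Key2": "Value2"}}
# Stand-in for the parsed contents of test.json, with one user-changed value.
loaded = json.loads('{"dict": {"Key3": "Value1", "Key1": "changed"}}')

result = merge_recursive(copy.deepcopy(default_dict), loaded)
print(result)
```

Here "Key1" keeps the value from the file, "Key2" falls back to its default, and the extra "Key3" is left alone. Merging into a deep copy keeps default_dict itself unchanged for later use.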

Barmar