I'm using this as a reference: Elegant way to remove fields from nested dictionaries

I have a large amount of JSON-formatted data, and we've determined a list of unnecessary keys (and all their underlying values) that we can remove.

I'm fairly new to working with JSON, and with Python specifically (I mostly did sysadmin work), and initially thought the data was just a plain dictionary of dictionaries. While some of it looks like that, several other pieces consist of dictionaries of lists, which can in turn contain more lists or dictionaries with no specific pattern.

The idea is to keep the data identical EXCEPT for the specified keys and associated values.

Test Data:

to_be_removed = ['leecher_here']

easy_modo = {
    'hello_wold':'konnichiwa sekai',
    'leeching_forbidden':'wanpan kinshi',
    'leecher_here':'nushiyowa'
}

lunatic_modo = {
    'hello_wold':
        {
            'leecher_here':'nushiyowa','goodbye_world':'aokigahara'
        },
    'leeching_forbidden':'wanpan kinshi',
    'leecher_here':'nushiyowa',
    'something_inside':
        {
            'hello_wold':'konnichiwa sekai',
            'leeching_forbidden':'wanpan kinshi',
            'leecher_here':'nushiyowa'
        },
    'list_o_dicts': 
        [
            {
                'hello_wold':'konnichiwa sekai',
                'leeching_forbidden':'wanpan kinshi',
                'leecher_here':'nushiyowa'
            }
        ]
}

Obviously, the original question posted there doesn't account for lists.

Here is my code, modified to work with my requirements:

from copy import deepcopy


def remove_key(json,trash):
    """
    <snip>
    """
    keys_set = set(trash)
    modified_dict = {}
    if isinstance(json,dict):
        for key, value in json.items():
            if key not in keys_set:
                if isinstance(value, dict):
                    modified_dict[key] = remove_key(value, keys_set)
                elif isinstance(value,list):
                    for ele in value:
                        modified_dict[key] = remove_key(ele,trash)
                else:
                    modified_dict[key] = deepcopy(value)
    return modified_dict

Something is clearly altering the structure, because the function fails the test I wrote: the expected data should be exactly the same as the input, minus the removed keys. The test shows that the keys are removed properly, but wherever the data should be a list of dictionaries, a single dictionary is returned instead, which will cause problems down the line.

I'm sure it's because the function returns a dictionary, but I don't know how to proceed from here in order to maintain the structure.
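
As a minimal sketch of the symptom (the names `value`, `ele` and `modified_dict` mirror the function above; the sample data is made up purely for illustration), assigning to the same key inside a loop keeps only the last element, so the list itself is lost:

```python
# Hypothetical sample: a list holding two dictionaries.
value = [{'a': 1}, {'b': 2}]
modified_dict = {}

for ele in value:
    # Every iteration assigns to the same key, overwriting the previous result.
    modified_dict['list_o_dicts'] = ele

print(type(modified_dict['list_o_dicts']).__name__)  # dict
print(modified_dict['list_o_dicts'])                 # {'b': 2}
```

Whatever the list contained, the key ends up holding one dictionary rather than a list of dictionaries.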


At this point, I'm needing help on what I could have overlooked.
fortesama

1 Answer


When you go through your JSON data, you only need to determine whether each value is a list, a dict, or neither. Here is a recursive way to modify your input dict in place:

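def remove_key(d, trash=None):
    """Recursively delete every key listed in trash from d, in place."""
    trash = set(trash or ())
    if isinstance(d, dict):
        # Iterate over a snapshot of the keys so deleting while looping is safe.
        for key in list(d):
            if key in trash:
                del d[key]
        # Descend into whatever values are left.
        for value in d.values():
            remove_key(value, trash)
    elif isinstance(d, list):
        # Lists are kept as lists; only their elements are cleaned.
        for value in d:
            remove_key(value, trash)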

remove_key(lunatic_modo,to_be_removed)
remove_key(easy_modo,to_be_removed)

Result:

{
    "hello_wold": {
        "goodbye_world": "aokigahara"
    },
    "leeching_forbidden": "wanpan kinshi",
    "something_inside": {
        "hello_wold": "konnichiwa sekai",
        "leeching_forbidden": "wanpan kinshi"
    },
    "list_o_dicts": [
        {
            "hello_wold": "konnichiwa sekai",
            "leeching_forbidden": "wanpan kinshi"
        }
    ]
}

{
    "hello_wold": "konnichiwa sekai",
    "leeching_forbidden": "wanpan kinshi"
}
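
If the original data must stay untouched, a non-mutating variant may fit better. The following is only a sketch under that assumption: the names `without_keys` and `_strip` are made up here and are not part of the code above, and the sample data is hypothetical. It rebuilds dicts and lists recursively and returns every other value as it is:

```python
def without_keys(data, trash):
    """Return a new structure equal to data minus every key in trash, at any depth."""
    trash = set(trash)

    def _strip(node):
        if isinstance(node, dict):
            # Rebuild the dict, skipping unwanted keys and cleaning the values.
            return {k: _strip(v) for k, v in node.items() if k not in trash}
        if isinstance(node, list):
            # Rebuild the list so a list of dicts stays a list of dicts.
            return [_strip(item) for item in node]
        # Anything else (str, int, float, bool, None in JSON data) is returned unchanged.
        return node

    return _strip(data)


# Hypothetical sample data, including a list nested inside a list.
sample = {
    'keep': 1,
    'leecher_here': 'x',
    'items': [{'leecher_here': 'y', 'keep': 2}, [{'leecher_here': 'z'}]],
}
cleaned = without_keys(sample, ['leecher_here'])
print(cleaned)  # {'keep': 1, 'items': [{'keep': 2}, [{}]]}
```

Because new dicts and lists are built at every level, `sample` still contains its `leecher_here` entries afterwards, and no `deepcopy` is needed for plain JSON data.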
Henry Yik