
What I want

From a YAML config I get a Python dictionary that looks like this:

conf = {
    'cc0': {
        'subselect': 
            {'roi_spectra': [0], 'roi_x_pixel_spec': 'slice(400, 1200)'},
        'spec': 
            {'subselect': {'x_property': 'wavenumber'}},

        'trace': 
            {'subselect': {'something': 'jaja', 'roi_spectra': [1, 2]}}
    }
}

As you can see, the keyword 'subselect' is common to all sublevels and its value is always a dict, but its existence is optional. The amount of nesting might change. I'm searching for a function that allows me to do the following:

# Desired function, which I believe needs recursion.
collect_config(conf, 'trace', 'subselect')

where 'trace' is the key of a nested dict that may itself contain a 'subselect' dict.

It should return

{'subselect': {
    'something': 'jaja',
    'roi_spectra': [1, 2],
    'roi_x_pixel_spec': 'slice(400, 1200)'
}}

or if I ask for

collect_config(conf, "spec", "subselect")

It should return

{'subselect': {
    'roi_spectra': [0],
    'roi_x_pixel_spec': 'slice(400, 1200)',
    'x_property': 'wavenumber'
}}

What I basically want is a way to pass the values of a key from the top level down to the lower levels and have the lower levels overwrite the top-level values. Much like class inheritance, but with a dictionary.

So I need a function that traverses the dict, finds a path to the desired key (here "trace" or "spec"), and fills up its value (here "subselect") with the values from the higher levels, but only for keys that the lower level does not define itself.
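
The override rule described above can be illustrated on a single pair of levels. This is only a sketch of the merge semantics, not the requested function; `parent` and `child` are hypothetical names standing in for the 'subselect' dicts at 'cc0' and at 'cc0' -> 'trace'.

```python
from collections import ChainMap

# Hypothetical stand-ins for the 'subselect' dicts at two nesting levels.
parent = {'roi_spectra': [0], 'roi_x_pixel_spec': 'slice(400, 1200)'}
child = {'something': 'jaja', 'roi_spectra': [1, 2]}

# Plain merge: start from the parent, then let the child overwrite.
merged = {**parent, **child}

# ChainMap answers the same lookups without copying: the first mapping wins.
layered = ChainMap(child, parent)

print(merged['roi_spectra'])        # the child's value overrides the parent's
print(layered['roi_x_pixel_spec'])  # missing in the child, so the parent's is used
```

The copying merge is probably the simpler choice when the result is handed on as keyword arguments; `ChainMap` may be preferable if the levels can change after they have been combined.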

A crappy solution

I currently have a rough implementation that looks like the following.

# This traverses the dict and gives me the path to get there as a list.
def traverse(dic, path=None):
    if not path:
        path=[]
    if isinstance(dic, dict):
        for x in dic.keys():
            local_path = path[:]
            local_path.append(x)
            for b in traverse(dic[x], local_path):
                yield b
    else:
        yield path, dic

# Traverses the config and searches for the property (keyword) prop.
# Values from higher levels update the return value.
# Once we reach the level of name (max_depth), only
# paths containing name are of interest. All other paths are
# ignored.
def collect_config(config, name, prop, max_depth):
    ret = {}
    for x in traverse(config):
        path = x[0]
        kwg = path[-1]
        value = x[1]
        current_depth = len(path)
        # We only care about the given property.
        if prop not in path:
            continue
        if current_depth < max_depth or (current_depth == max_depth and name in path):
            ret.update({kwg: value})
    return ret

and I could then call it with

collect_config(conf, "trace", 'subselect', 4)

and get

{'roi_spectra': [0],
 'roi_x_pixel_spec': 'slice(400, 1200)',
 'something': 'jaja'}

Update

jdehesa is almost there, but I could also have a config that looks like:

conf = {
    'subselect': {'test': 'jaja'},
    'bg0': {
      'subselect': {'roi_spectra': [0, 1, 2]}},
    'bg1': {
      'subselect': {'test': 'nene'}},
}

For this config, the call

collect_config(conf, 'bg0', 'subselect')

returns

{'roi_spectra': [0, 1, 2]}

instead of the expected

{'roi_spectra': [0, 1, 2], 'test': 'jaja'}
user3613114
  • So basically you want the dictionary named by the last argument (providing defaults), updated with overrides loaded from the path formed by the second argument plus the last argument. – Martijn Pieters Aug 31 '17 at 09:02
  • How is the function supposed to figure out the key inside the top-level `conf` dictionary? In your example, there is only a single key, `cc0`. If there's always just a single key, what's the point? If there are sometimes more than one, which one should it take? – Sven Marnach Aug 31 '17 at 09:08
  • Can you check whether this helps? (https://stackoverflow.com/questions/9807634/find-all-occurrences-of-a-key-in-nested-python-dictionaries-and-lists) – Sujal Sheth Aug 31 '17 at 09:15
  • I have updated my solution to account for the case that you have added. – jdehesa Aug 31 '17 at 10:50
  • Care to reply to my question? If `conf` contains two dictionaries `"cc0"` and `"cc1"`, and both of them contain a key `"spec"`, which one do you expect to be returned by `collect_config(conf, "spec", "subselect")`? I honestly don't understand what exactly you want, and why you chose to use this data structure. – Sven Marnach Aug 31 '17 at 13:23
  • Hey Sven. Thanks for your interest. Well, I guess my code is buggy there. It was meant as a clarification of what I want, because I found it really hard to express myself. Of course there could and will be multiple entries like "cc0" and "cc1"; I just dumbed it down to have something to start with. – user3613114 Sep 01 '17 at 08:26

1 Answer

Here is my take:

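def collect_config(conf, key, prop, max_depth=-1):
    # Start from this level's own prop dict (copied so conf is not mutated).
    prop_val = conf.get(prop, {}).copy()
    if key in conf:
        # Found the requested key: its prop values override this level's.
        prop_val.update(conf[key].get(prop, {}))
        return prop_val
    if max_depth == 0:
        # Depth budget exhausted without finding key.
        return None
    for k, v in conf.items():
        if not isinstance(v, dict):
            continue
        # Search one level deeper; the first branch containing key wins.
        prop_subval = collect_config(v, key, prop, max_depth - 1)
        if prop_subval is not None:
            # Deeper values override the ones collected at this level.
            prop_val.update(prop_subval)
            return prop_val
    # key was not found anywhere below this level.
    return None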

conf = {
    'cc0': {
        'subselect': 
            {'roi_spectra': [0], 'roi_x_pixel_spec': 'slice(400, 1200)'},
        'spec': 
            {'subselect': {'x_property': 'wavenumber'}},

        'trace': 
            {'subselect': {'something': 'jaja', 'roi_spectra': [1, 2]}}
    }
}
print(collect_config(conf, "trace", 'subselect', 4))
>>> {'roi_x_pixel_spec': 'slice(400, 1200)',
     'roi_spectra': [1, 2],
     'something': 'jaja'}

conf = {
    'subselect': {'test': 'jaja'},
    'bg0': {
      'subselect': {'roi_spectra': [0, 1, 2]}},
    'bg1': {
      'subselect': {'test': 'nene'}},
}
print(collect_config(conf, 'bg0', 'subselect'))
>>> {'roi_spectra': [0, 1, 2], 'test': 'jaja'}

Leaving max_depth as -1 would traverse conf with no depth limit.
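
One of the comments on the question points out that a bare key such as "spec" becomes ambiguous once conf holds several top-level entries. A possible variant, sketched here under the assumption that the caller knows the full path to the target, takes that path explicitly. `collect_along_path` and its `path` parameter are hypothetical names, not part of the function above, and the 'cc1' entry is made-up example data.

```python
def collect_along_path(conf, path, prop):
    """Merge every `prop` dict met while walking `path`; deeper levels win."""
    merged = dict(conf.get(prop, {}))
    node = conf
    for key in path:
        node = node[key]  # raises KeyError if the path does not exist
        merged.update(node.get(prop, {}))
    return merged


# Example data: 'cc1' and its values are invented for illustration only.
conf = {
    'cc0': {
        'subselect': {'roi_spectra': [0]},
        'spec': {'subselect': {'x_property': 'wavenumber'}},
    },
    'cc1': {
        'spec': {'subselect': {'x_property': 'wavelength'}},
    },
}
print(collect_along_path(conf, ['cc0', 'spec'], 'subselect'))
print(collect_along_path(conf, ['cc1', 'spec'], 'subselect'))
```

With an explicit path there is no search and therefore no need for a depth limit, at the cost of the caller having to know where the target sits.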

jdehesa