I am using yaml and pyyaml to configure my application.

Is it possible to configure something like this -

config.yml -

root:
    repo_root: /home/raghhuveer/code/data_science/papers/cv/AlexNet_lght
    data_root: $root.repo_root/data

service:
    root: $root.data_root/csv/xyz.csv

yaml loading function -

import logging
import os

import yaml

def load_config(config_path):
    config_path = os.path.abspath(config_path)

    if not os.path.isfile(config_path):
        raise FileNotFoundError("{} does not exist".format(config_path))

    with open(config_path) as f:
        config = yaml.load(f, Loader=yaml.SafeLoader)
    logging.info("Config used for run - \n{}".format(yaml.dump(config, sort_keys=False)))
    return DotDict(config)  # DotDict is defined elsewhere (not shown)

Current Output-

root:
  repo_root: /home/raghhuveer/code/data_science/papers/cv/AlexNet_lght
  data_root: $root.repo_root/data

service:
  root: $root.data_root/csv/xyz.csv

Desired Output -

root:
  repo_root: /home/raghhuveer/code/data_science/papers/cv/AlexNet_lght
  data_root: /home/raghhuveer/code/data_science/papers/cv/AlexNet_lght/data

service:
  root: /home/raghhuveer/code/data_science/papers/cv/AlexNet_lght/data/csv/xyz.csv

Is this even possible with Python? If so, any help would be much appreciated.

Thanks in advance.

  • Does this answer your question? https://stackoverflow.com/questions/1773805/how-can-i-parse-a-yaml-file-in-python – baduker Sep 06 '20 at 07:43
  • Nice task. I see no reason why it would not be possible. Have you tried anything yourself already? What does the real structure look like (how "deep" is it)? Are the variables always the "top-level" keys? Please provide more specific info to make answering easier – Jan Stránský Sep 06 '20 at 07:48
  • Some ideas to get you started: You could use [template strings](https://docs.python.org/3/library/string.html#template-strings) and build the variable strings yourself, or you could use regular expressions. – Wups Sep 06 '20 at 08:17
  • @Wups I am a beginner with yaml configs and your comment just went over my head. Could you maybe dumb it down for me? – raghhuveer-jaikanth Sep 06 '20 at 08:56
  • @JanStránský I have edited the question to display my current method. This is as deep as I want to keep the yaml – raghhuveer-jaikanth Sep 06 '20 at 08:57
  • @raghhuveer-jaikanth this is independent of YAML, which is just a file format. The same would hold for JSON, XML, ... – Jan Stránský Sep 06 '20 at 10:00
  • Are you going to use exclusively python? Then [this post](https://stackoverflow.com/questions/5484016/how-can-i-do-string-concatenation-or-string-replacement-in-yaml) may be useful – Jan Stránský Sep 06 '20 at 10:23
  • @JanStránský yep that link worked wonders. Thanks! – raghhuveer-jaikanth Sep 06 '20 at 11:00

1 Answer

A general approach:

  • read the file as is
  • search for strings containing $:
    • determine the "path" of "variables"
    • replace the "variables" with actual values

An example that recurses through dictionaries and replaces the variables found in strings:

import pprint
import re

import yaml

VAR = re.compile(r"\$(\w+(?:\.\w+)*)")  # matches $key_1.key_2.keyN sequences

def convert(node, top=None):
    """Replace $key1.key2 with actual values. Modifies dictionaries in place."""
    if top is None:
        top = node  # top is the original input, used for all lookups
    if isinstance(node, dict):
        for k, v in node.items():
            node[k] = convert(v, top)  # recursively convert items
        return node  # return the updated dictionary
    if isinstance(node, str):
        def lookup(match):
            val = top  # starting from top ...
            for k in match.group(1).split("."):  # ... for each key in the key chain ...
                val = val[k]  # ... go one level down
            return str(convert(val, top))  # resolve the value itself first, so key order does not matter
        return VAR.sub(lookup, node)  # replace each $key sequence with its actual value
    return node  # int, float, list, ...: returned unchanged (TODO: descend into lists)

with open("in.yml") as f:
    config = yaml.safe_load(f)  # load as is
convert(config)  # convert it (in place)
pprint.pprint(config)

Output:

{'root': {'data_root': '/home/raghhuveer/code/data_science/papers/cv/AlexNet_lght/data',
          'repo_root': '/home/raghhuveer/code/data_science/papers/cv/AlexNet_lght'},
 'service': {'root': '/home/raghhuveer/code/data_science/papers/cv/AlexNet_lght/data/csv/xyz.csv'}}

Note: YAML is not essential here; the same approach also works with JSON, XML or other formats, because the substitution runs on the loaded Python data rather than on the file itself.
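To illustrate the note above, here is a minimal sketch of the same substitution applied to data loaded from JSON. The `resolve` helper, the `VAR` pattern and the `/srv/project` path are made up for this example, and the `$key.key` syntax is simply carried over from the question; it is not a JSON or YAML feature.

```python
import json
import re

# Matches $key1.key2.keyN (syntax assumed from the question, not a JSON feature)
VAR = re.compile(r"\$(\w+(?:\.\w+)*)")

def resolve(node, top):
    """Return a copy of node with $a.b references replaced by values from top."""
    if isinstance(node, dict):
        return {k: resolve(v, top) for k, v in node.items()}
    if isinstance(node, list):
        return [resolve(v, top) for v in node]
    if isinstance(node, str):
        def lookup(match):
            val = top
            for key in match.group(1).split("."):  # walk the key chain
                val = val[key]
            return str(resolve(val, top))  # the referenced value may hold references too
        return VAR.sub(lookup, node)
    return node  # numbers, booleans, None: unchanged

raw = json.loads("""
{
  "service": {"root": "$root.data_root/csv/xyz.csv"},
  "root": {"repo_root": "/srv/project", "data_root": "$root.repo_root/data"}
}
""")
config = resolve(raw, raw)
print(config["service"]["root"])  # /srv/project/data/csv/xyz.csv
```

Note that a circular reference (a value that refers back to itself) would recurse until Python raises RecursionError; this sketch does not guard against that.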

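Note 2: If you use exclusively YAML and exclusively Python, some answers from [this post](https://stackoverflow.com/questions/5484016/how-can-i-do-string-concatenation-or-string-replacement-in-yaml) may be useful (they use anchors and references together with application-specific local tags).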
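A rough sketch of the anchors-plus-local-tag idea, assuming PyYAML: the `!join` tag and the `join` function are names chosen for this example (PyYAML does not ship them), and the paths are placeholders. Anchors (`&name`) and aliases (`*name`) are standard YAML; the concatenation is done by the custom constructor.

```python
import yaml

def join(loader, node):
    """Constructor for the hypothetical !join tag: concatenate the sequence items."""
    parts = loader.construct_sequence(node)
    return "".join(str(p) for p in parts)

# Register the tag on SafeLoader only, so other loaders are left untouched
yaml.add_constructor("!join", join, Loader=yaml.SafeLoader)

doc = """
root:
  repo_root: &repo_root /home/user/project
  data_root: &data_root !join [*repo_root, /data]
service:
  root: !join [*data_root, /csv/xyz.csv]
"""

config = yaml.load(doc, Loader=yaml.SafeLoader)
print(config["service"]["root"])  # /home/user/project/data/csv/xyz.csv
```

Unlike the `$key.key` approach, an anchor has to be defined earlier in the file than any alias that uses it, and the resulting file can only be read by a loader that knows the `!join` tag.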

Jan Stránský