
Suppose I have a YAML file as follows:

template:
 artifacts:
  config:
   a: value1
   b: value2
  jars:
   a: value1
   b: value2
  scripts:
   a: value1
   b: value2

I would like to display it as a tree, like this:

template--
          |__artifacts__
                        |__config__
                        |          |__a__
                        |          |     |__value1
                        |          |
                        |          |__b__
                        |                |__value2
                        |__jars__ ...

How can I do that?

Anthon
Arijit Das

1 Answer


There are multiple YAML parsers available for Python, but the only one that supports the latest YAML specification (1.2, released in 2009) is ruamel.yaml (disclaimer: I am the author of that package). The other packages (PySyck, PyYAML) also cannot load some valid YAML constructs, such as sequences or mappings used as mapping keys. ruamel.yaml can be directed to dump YAML 1.1 for outdated packages that support only that version of the specification.

Nested Python dicts can be used as a tree structure, in which each key is the value of a node and each non-dict value is a leaf node. This is the data structure that is loaded from the mappings in your YAML file.

from pathlib import Path
from pprint import pprint
import ruamel.yaml

input = Path('input.yaml')
yaml = ruamel.yaml.YAML()
data = yaml.load(input)
pprint(data)

which gives:

{'template': {'artifacts': {'config': {'a': 'value1',
                                       'b': 'value2'},
                            'jars': {'a': 'value1',
                                     'b': 'value2'},
                            'scripts': {'a': 'value1',
                                        'b': 'value2'}}}}

This doesn't look like your expected output, and dicts are not really a tree structure either. You can of course walk over your data structure and create a tree of Nodes, but that is a bit backward, because you can tell the parser to create a Node directly while it builds the tree.
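For reference, the walking approach mentioned above could be sketched as follows. This is only a minimal illustration that assumes the loaded data consists of nested dicts with scalar leaves; the helper name `render_tree` and the exact indentation are made up for this example and only approximate the drawing in the question.

```python
def render_tree(data, indent=""):
    """Return a list of lines drawing nested dicts as an indented tree."""
    lines = []
    for key, value in data.items():
        lines.append(f"{indent}|__{key}")
        child_indent = indent + "   "
        if isinstance(value, dict):
            # A nested mapping becomes a subtree one level deeper.
            lines.extend(render_tree(value, child_indent))
        else:
            # Anything that is not a dict is treated as a leaf.
            lines.append(f"{child_indent}|__{value}")
    return lines


# Hand-written stand-in for part of the data loaded from the YAML file.
data = {"template": {"artifacts": {"config": {"a": "value1", "b": "value2"},
                                   "jars": {"a": "value1", "b": "value2"}}}}
print("\n".join(render_tree(data)))
```

The parser-driven alternative is shown next.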

from ruamel.yaml.constructor import SafeConstructor

class Node:
    # your node definition here
    pass


class MyConstructor(SafeConstructor):
    def construct_yaml_map(self, node):
        data = Node()
        yield data
        res = self.construct_mapping(node)
        # and update data with the parsed data

MyConstructor.add_constructor('tag:yaml.org,2002:map', 
                              MyConstructor.construct_yaml_map)


yaml = ruamel.yaml.YAML()
yaml.Constructor = MyConstructor
data = yaml.load(input)

Please note that the above automatically deals with recursive structures in your YAML file, something not as easily realised when walking over the YAML loaded in the normal way.
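To illustrate why recursion needs extra care when walking normally loaded data: a mapping that refers back to itself through an anchor and alias is loaded as a dict that contains itself, so a naive recursive walk never terminates. The sketch below is one possible way to guard against that by tracking the `id()` of every dict already visited. The function name `walk` and the `<seen>` marker are hypothetical, and the self-referencing dict is built by hand instead of being loaded from YAML.

```python
def walk(data, seen=None, depth=0):
    """Yield (depth, label) pairs; dicts met a second time yield '<seen>'."""
    seen = set() if seen is None else seen
    if isinstance(data, dict):
        if id(data) in seen:
            # Already visited: either a recursive reference or a shared alias.
            yield depth, "<seen>"
            return
        seen.add(id(data))
        for key, value in data.items():
            yield depth, str(key)
            yield from walk(value, seen, depth + 1)
    else:
        yield depth, str(data)


# Hand-built stand-in for a mapping that contains an alias to itself.
top = {"name": "root"}
top["self"] = top
for depth, label in walk(top):
    print("   " * depth + label)
```

Note that this guard also marks mappings that are merely shared between two places, not only truly recursive ones.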

Anthon