I am working on a project that parses AWS CloudFormation YAML files and extracts every !ImportValue from each template.

I am trying to do this with ruamel.yaml, which I am new to. So far I can read the YAML file and get at the individual elements.

import ruamel.yaml

def general_constructor(loader, tag_suffix, node):
  # return the raw node content for any tag starting with '!'
  return node.value

ruamel.yaml.SafeLoader.add_multi_constructor(u'!', general_constructor)

# cfFile holds the path of the CloudFormation template to read
with open(cfFile, 'r') as service:
  stream = service.read()

yaml_data = ruamel.yaml.safe_load(stream)
print(yaml_data)

The code above loads the specified YAML file, and the output looks like the following.

{'Application': {'Properties': {'ApplicationName': [ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'-'),
    SequenceNode(tag=u'tag:yaml.org,2002:seq', value=[ScalarNode(tag=u'tag:yaml.org,2002:str', value=u'***'), ScalarNode(tag=u'!ImportValue', value=u'jkl')])],
   *
   *
     ScalarNode(tag=u'!ImportValue', value=u'def'),
   *
   *
     ScalarNode(tag=u'!ImportValue', value=u'rst')])]},


So there are a number of !ImportValue tags held in ScalarNode objects (e.g. ScalarNode(tag=u'!ImportValue', value=u'rst')), and those values are what I actually want to extract. They are scattered throughout the template. What would be the best way to extract them?

In our CloudFormation setup we have a number of YAML files: some of them export certain resources and others import them. I want to build a dependency map (maybe as a JSON file) that shows the interdependence between the CloudFormation files.

Pankaj Kolhe
  • Does [this](https://stackoverflow.com/a/55349491/3787051) provide you with a better option than writing all this yourself in Python? – Alex Harvey Apr 02 '19 at 11:21
  • This helps a little, but only with the reading part, which is exactly what my first code snippet already does. The main ask here is extracting the !ImportValue entries that are scattered across the YAML files. – Pankaj Kolhe Apr 02 '19 at 11:53

1 Answer

If you use ruamel.yaml's round-trip loader, you don't have to do anything special to load the tags, and walking recursively over the resulting data structure is relatively easy. The enclosing key needs to be passed down during the recursion, because at least the first !ImportValue sits inside a sequence under its key rather than directly under it.

Assuming an input.yaml consisting of:

Application:
  Properties:
    ApplicationName: ["-", ["**", !ImportValue "jkl"]]

  AnotherKey:
  - 42
  - nested: !ImportValue xyz

(which might not be exactly what you got as input, but will do for demonstration purposes), and using the new ruamel.yaml API (which defaults to round-trip loading/dumping):

from pathlib import Path
import ruamel.yaml

# name of the attribute under which round-trip loaded nodes keep their tag
ta = ruamel.yaml.comments.Tag.attrib

yaml = ruamel.yaml.YAML()
data = yaml.load(Path('input.yaml'))

def process(d, key=None):
    if isinstance(d, dict):
        for k, v in d.items():
            for res in process(v, k):  # recurse and pass on new key
                yield res
    elif isinstance(d, list):
        for item in d:
            for res in process(item, key):  # list items keep the parent's key
                yield res
    else:
        try:
            if getattr(d, ta, None).value == '!ImportValue':
                yield (key, d)
        except AttributeError:
            pass  # plain scalar without a tag

for k, v in process(data):
    print(k, '->', v)

which gives:

ApplicationName -> jkl
nested -> xyz
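
To get from these pairs to the dependency map mentioned in the question, something like the following might do. It is only a sketch under a few assumptions: the import names per template have already been collected (for example from the values yielded by `process()`), the export names per template have been gathered separately (in CloudFormation they normally live under `Outputs` in each entry's `Export` / `Name`, which is not shown here), and the function name, dictionary layout and file names are all hypothetical.

```python
import json
from collections import defaultdict


def build_dependency_map(imports_by_file, exports_by_file):
    """Map each template to the templates whose exports it imports.

    Both arguments are hypothetical inputs: dicts from a template
    filename to the list of export names it imports or exports.
    """
    # Invert the exports: export name -> template that defines it.
    exporter_of = {}
    for filename, names in exports_by_file.items():
        for name in names:
            exporter_of[name] = filename

    dependencies = defaultdict(set)
    unresolved = defaultdict(set)
    for filename, names in imports_by_file.items():
        for name in names:
            if name in exporter_of:
                dependencies[filename].add(exporter_of[name])
            else:
                # imported somewhere, but no known template exports it
                unresolved[filename].add(name)

    # Sets are not JSON serializable, so convert them to sorted lists.
    return {
        'depends_on': {f: sorted(d) for f, d in dependencies.items()},
        'unresolved': {f: sorted(n) for f, n in unresolved.items()},
    }


# Made-up sample data standing in for what was extracted from real templates.
imports_by_file = {
    'service.yaml': ['jkl', 'xyz'],
    'worker.yaml': ['jkl', 'missing-export'],
}
exports_by_file = {
    'network.yaml': ['jkl'],
    'database.yaml': ['xyz'],
}

dependency_map = build_dependency_map(imports_by_file, exports_by_file)
print(json.dumps(dependency_map, indent=2, sort_keys=True))
```

Keeping the unresolved names in a separate entry makes it easier to spot imports whose exporting template was not among the parsed files.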
Anthon
  • I see you updated your question to no longer require the key. You can delete that part of the code, or just ignore it in the final `for` loop. – Anthon Apr 02 '19 at 12:17