1

I have the following YAML:

instance:
  name: test
  flavor: x-large
  image: centos7

tasks:
  centos-7-prepare:
    priority: 1
    details::
      ha: 0
      args:
        template: &startup
          name: startup-centos-7
          version: 1.2
        timeout: 1800

  centos-7-crawl:
    priority: 5
    details::
      ha: 1
      args:
        template: *startup
        timeout: 0

The first task defines the template name and version, which are then reused by the other tasks. The structure of the template definition will not change, but other parts, especially the task names, will.

What would be the best way to change the template name and version in Python?

I have the following regex for matching (using re.DOTALL):

template:.*name: (.*?)version: (.*?)\s

However, I have not figured out how to use re.sub with it so far. Or is there a more convenient way of doing this?
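For reference, the re.sub route could be sketched roughly as below. This is only an illustration under a few assumptions: the pattern is a variation of the one above (the unchanged text is captured in extra groups so it can be put back), the helper name replace_template is made up for this example, and the values are assumed to contain no whitespace. It is a text-level substitution, so it knows nothing about YAML structure.

```python
import re

doc = """\
tasks:
  centos-7-prepare:
    priority: 1
    details::
      ha: 0
      args:
        template: &startup
          name: startup-centos-7
          version: 1.2
        timeout: 1800

  centos-7-crawl:
    priority: 5
    details::
      ha: 1
      args:
        template: *startup
        timeout: 0
"""

# Groups 1 and 3 capture the text that must stay as it is;
# groups 2 and 4 capture the old name and version.
pattern = re.compile(
    r"(template:.*?name: )(\S+)(\s+version: )(\S+)",
    re.DOTALL,
)


def replace_template(text, name, version):
    # Hypothetical helper: rebuild the match from the kept groups plus the
    # new values. count=1 limits the change to the first template definition.
    return pattern.sub(
        lambda m: m.group(1) + name + m.group(3) + version,
        text,
        count=1,
    )


updated = replace_template(doc, "startup-centos-8", "1.3")
print(updated)
```

A function is used as the replacement instead of a template string so that backslashes or group-like sequences in the new values are not interpreted by re.sub.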

Anthon
Rezney

2 Answers

3

For this kind of round-tripping (load-modify-dump) of YAML you should be using ruamel.yaml (disclaimer: I am the author of that package).

If your input is in input.yaml, you can then relatively easily find the name and version under key template and update them:

import sys
import ruamel.yaml

def find_template(d):
    if isinstance(d, list):
        for elem in d:
            x = find_template(elem)
            if x is not None:
                return x
    elif isinstance(d, dict):
        for k in d:
            v = d[k]
            if k == 'template':
                if 'name' in v and 'version' in v:
                    return v
            x = find_template(v)
            if x is not None:
                return x
    return None


yaml = ruamel.yaml.YAML()
# yaml.indent(mapping=4, sequence=4, offset=2)
yaml.preserve_quotes = True

with open('input.yaml') as ifp:
    data = yaml.load(ifp)
template = find_template(data)
template['name'] = 'startup-centos-8'
template['version'] = '1.3'

yaml.dump(data, sys.stdout)

which gives:

instance:
  name: test
  flavor: x-large
  image: centos7

tasks:
  centos-7-prepare:
    priority: 1
    'details:':
      ha: 0
      args:
        template: &startup
          name: startup-centos-8
          version: '1.3'
        timeout: 1800

  centos-7-crawl:
    priority: 5
    'details:':
      ha: 1
      args:
        template: *startup
        timeout: 0

Please note that the (superfluous) quotes that I inserted in the input, as well as the comment and the name of the alias are preserved.

Anthon
  • Works great. Accepted – Rezney Oct 05 '18 at 09:19
  • Btw could you please provide any source or elaborate a little why it is preferred to use "ruamel.yaml"? Cheers... – Rezney Oct 05 '18 at 10:11
  • @Rezney The [documentation](https://yaml.readthedocs.io/en/latest/) has some of that information. In short: YAML 1.2 support, round-tripping without losing a lot of information that is useful for humans, and a single source, which makes fixing bugs easier (PyYAML has a lot of duplicate code in separate sources for Python 2 and 3, respectively). It also has a better API, where unsafe loading has to be requested explicitly. – Anthon Oct 05 '18 at 11:49
  • Thanks. Is there any way to keep unused anchors? Or should I rather open a separate question for this... – Rezney Oct 10 '18 at 16:18
  • @Rezney No there is not. There already is an [open issue](https://bitbucket.org/ruamel/yaml/issues/64/) for that. – Anthon Oct 10 '18 at 20:34
2

I would parse the YAML file into a dictionary, then edit the field and write the dictionary back out to YAML.

See the question "How can I parse a YAML file in Python" for a discussion of parsing YAML in Python, but I think you would end up with something like this.

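from io import StringIO
from ruamel.yaml import YAML

yaml = YAML(typ='safe')
yaml.default_flow_style = False

# Read the YAML text (file name assumed to be input.yaml) and parse it
with open('input.yaml') as ifp:
    doc = ifp.read()
myConfig = yaml.load(doc)

# Example replacement code. The anchor (&startup) is not part of the loaded
# data, so compare against the actual template name instead.
for task in myConfig["tasks"].values():
    # The YAML in the question has "details::", so the key is "details:"
    template = task["details:"]["args"]["template"]
    if template["name"] == "startup-centos-7":
        template["name"] = "new value"

# Convert back to a string
buf = StringIO()
yaml.dump(myConfig, buf)
updatedYml = buf.getvalue()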
Anthon
Evan Snapp
  • I was thinking about dict but what if some key name changes? It is for automation... – Rezney Oct 04 '18 at 19:20
  • @Rezney So the names of properties like `centos-7-crawl` can change? If that is the case, you can look through each of the fields and check whether the value is the one you want to change. – Evan Snapp Oct 04 '18 at 19:25
  • Yep, the one is most likely to change. I will try to think about your idea but wanted to avoid blind looping. Thanks anyway... – Rezney Oct 04 '18 at 19:30
  • Which package do you use? The accepted answer on the question you link to is using PyYAML. And I hope you don't recommend that, given your code. – Anthon Oct 04 '18 at 19:40
  • Thanks for catching that Anthon! I updated it to use your `ruamel.yaml` library. I had just been going for more general pseudo code. – Evan Snapp Oct 04 '18 at 19:51
  • 1
    @EvanSnapp I just wanted to make sure you were not using the potentially non-safe `yaml.load()` from PyYAML. So many answer propose that, without realising it can be unsafe, and that it is completely unnecessary to use that instead of `safe_load()`. – Anthon Oct 05 '18 at 11:55