I have several YAML files that may have different nested structures.

file1.yaml:

test3:
  service1: 
      name1: |
        "somedata"
      name2: |
          "somedata"

file2.yaml:

test1: 
  app1: 
     app2: |
       "somedata"
  app7:
     key2: | 
       "testapp"

As you can see, the structure of the YAML files may differ.

The question is: can I flexibly append data to particular blocks of these files?

For example, in file1 I want to write key/value pairs at the level of the name1 and name2 keys, or at the level of service1:

test3:
  service1: 
      name1: |
        "somedata"
      name2: |
          "somedata"
      my-appended-key: |
              "my appended value"
  my_second_appended_key: |
          "my second appended value"

and so on.

For now I handle only one specific YAML file structure. Here is part of my code:

import gnupg
import re
import argparse

def NewPillarFile():
    with open(args.sensitive) as sensitive_data:
        with open(args.encrypted, "w") as encrypted_result:
            encrypted_result.write('#!yaml|gpg\n\nsecrets:\n    '+args.service+':\n')
            for line in sensitive_data:
                encrypted_value = gpg.encrypt(re.sub(r'^( +?|[A-Za-z0-9]|[A-Za]|[0-9])+( +)?'+args.separator+'( +)?','',line,1), recipients=args.resident, always_trust=True)
                if not encrypted_value.ok:
                    print(encrypted_value.status, '\n', encrypted_value.stderr)
                    break
                line = re.sub(r'^( +)?','',line)
                encrypted_result.write('        '+re.sub(r'( +)?'+args.separator+'.*',': |',line))
                encrypted_result.write(re.sub(re.compile(r'^', re.MULTILINE), '            ', encrypted_value.data.decode())+'\n')

def ExistingPillarFile():
    with open(args.sensitive) as sensitive_data:
        with open(args.encrypted, "a") as encrypted_result:
            encrypted_result.write('    '+args.service+':\n')
            for line in sensitive_data:
                encrypted_value = gpg.encrypt(
                    re.sub(r'^( +?|[A-Za-z0-9]|[A-Za]|[0-9])+( +)?' + args.separator + '( +)?', '', line, 1),
                    recipients=args.resident, always_trust=True)
                if not encrypted_value.ok:
                    print(encrypted_value.status, '\n', encrypted_value.stderr)
                    break
                line = re.sub(r'^( +)?', '', line)
                encrypted_result.write('        ' + re.sub(r'( +)?' + args.separator + '.*', ': |', line))
                encrypted_result.write(re.sub(re.compile(r'^', re.MULTILINE), '            ', encrypted_value.data.decode())+'\n')

So the idea is to be able to specify the nested block in the YAML file under which I want to append data, to make the script more flexible.

user541
  • Can you show what you have tried already? I don't see a specific problem to do that; you would load the data into python, modify the data, and dump it back again into the file. If you show your existing code, it's easier to help – tinita Mar 18 '18 at 21:16
  • @tinita, added. – user541 Mar 19 '18 at 07:57

1 Answer

You can use PyYAML's low-level event interface. Assuming you have an input YAML file and want to write the modifications to an output YAML file, you can write a function that goes through PyYAML's generated event stream and inserts the requested additional values at the specified locations:

import yaml
from yaml.events import *

class AppendableEvents:
  def __init__(self, path, events):
    self.path = path
    self.events = events

  def correct_position(self, levels):
    if len(self.path) != len(levels):
      return False
    for index, expected in enumerate(self.path):
      if expected != levels[index].cur_id:
        return False
    return True

class Level:
  def __init__(self, mode):
    self.mode = mode
    self.cur_id = -1 if mode == "item" else ""

def append_to_yaml(yamlFile, targetFile, items):
  events = []
  levels = []
  with open(yamlFile, 'r') as handle:
    for event in yaml.parse(handle):
      if isinstance(event, StreamStartEvent) or \
         isinstance(event, StreamEndEvent) or \
         isinstance(event, DocumentStartEvent) or \
         isinstance(event, DocumentEndEvent):
        pass
      elif isinstance(event, CollectionStartEvent):
        if len(levels) > 0:
          if levels[-1].mode == "key":
            # we can only handle scalar keys
            raise ValueError("encountered complex key!")
          elif levels[-1].mode == "value":
            # this collection is the value of the current key
            levels[-1].mode = "key"
          else: # mode == "item": a nested collection also counts as an item
            levels[-1].cur_id += 1
        if isinstance(event, MappingStartEvent):
          levels.append(Level("key"))
        else: # SequenceStartEvent
          levels.append(Level("item"))
      elif isinstance(event, ScalarEvent):
        if len(levels) > 0:
          if levels[-1].mode == "item":
            levels[-1].cur_id += 1
          elif levels[-1].mode == "key":
            levels[-1].cur_id = event.value
            levels[-1].mode = "value"
          else: # mode == "value"
            levels[-1].mode = "key"
      elif isinstance(event, CollectionEndEvent):
        # here we check whether we want to append anything
        levels.pop()
        for item in items:
          if item.correct_position(levels):
            for additional_event in item.events:
              events.append(additional_event)
      events.append(event)
  with open(targetFile, mode="w") as handle:
    yaml.emit(events, handle)

To use it, you must provide the additional content you want to append as a list of YAML events, and specify the desired position as a list of keys (or sequence indexes):

def key(name):
  return ScalarEvent(None, None, (True, True), name)

def literal_value(content):
  return ScalarEvent(None, None, (False, True), content, style="|")

append_to_yaml("file1.yaml", "file1_modified.yaml", [
  AppendableEvents(["test3", "service1"], [
    key("my-appended-key"),
    literal_value("\"my appended value\"\n")]),
  AppendableEvents(["test3"], [
    key("my_second_appended_key"),
    literal_value("\"my second appended value\"\n")])])

This code correctly transforms your file1.yaml into the given modified file. In general, this also allows you to append complex (sequence or mapping) nodes. Here's a basic example of how to do that:

def seq(*scalarValues):
  return [SequenceStartEvent(None, None, True)] + \
    [ScalarEvent(None, None, (True, False), v) for v in scalarValues] + \
    [SequenceEndEvent()]

def map(*scalarValues):
  return [MappingStartEvent(None, None, True)] + \
    [ScalarEvent(None, None, (True, False), v) for v in scalarValues] + \
    [MappingEndEvent()]

append_to_yaml("file1.yaml", "file1_modified.yaml", [
  AppendableEvents(["test3", "service1"], [
    key("my-appended-key")] + seq("one", "two", "three")),
  AppendableEvents(["test3"], [
    key("my_second_appended_key")] + map("foo", "bar"))])
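
If you are unsure which events a given structure needs, one option is to let PyYAML's parser produce them from a small YAML snippet. The following is only a sketch: `events_from_yaml` is a hypothetical helper name, not part of PyYAML, and it assumes the snippet contains a single document. It drops the stream and document events so that only the node events remain.

```python
import yaml
from yaml.events import (StreamStartEvent, StreamEndEvent,
                         DocumentStartEvent, DocumentEndEvent)

def events_from_yaml(snippet):
    """Parse a YAML snippet and return only its node events.

    Hypothetical helper: the stream/document wrapper events are
    filtered out so the result can be spliced into another stream.
    """
    wrappers = (StreamStartEvent, StreamEndEvent,
                DocumentStartEvent, DocumentEndEvent)
    return [e for e in yaml.parse(snippet) if not isinstance(e, wrappers)]

# The nested value of app7 from file2.yaml, written as plain YAML text.
snippet = """\
key2: |
  "testapp"
"""

# Print the events to see how the structure looks as an event stream.
for event in events_from_yaml(snippet):
    print(event)
```

The returned list should be usable in the same place as the results of `seq` and `map` above, for example `[key("app7")] + events_from_yaml(snippet)`, since the parser records the scalar style (here the literal `|`) on the events it creates.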
flyx
  • How about something easier? Or this solution is the easiest one? – user541 Mar 21 '18 at 11:52
  • This solution is what you do when the YAML implementation does not provide you with a way of transforming loaded DOMs (like e.g. XML does with XSLT / XPath). There is no comparable YAMLPath I am aware of, so you just implement it yourself. Applying some cleverness to `append_to_yaml` could surely make using it easier, but transcends the scope of an SO answer. If there was a way to tell if something is „the easiest“ solution, software development would be so much easier :), but I am pretty sure there is no one-liner you can use. – flyx Mar 21 '18 at 12:53
  • You could of course load the YAML file into a native dict-based structure and do the transformations there. However, this would make it much harder to specify the desired rendering of your values (literal vs plain scalars and so on) and may mess up the order of the mapping keys because they are parsed into a hashmap. If you do not care about these drawbacks, doing the transformation on dicts and serialising the result afterwards is probably „easier“. – flyx Mar 21 '18 at 13:00
  • Thank you! I did some research and you are right, this way is the simplest one for my case. However, this approach only supports inserting a key and a value. Is it also possible to insert key: subkey: value, like in file2.yaml, where app7 has the subkey key2 containing the value "testapp"? – user541 Mar 24 '18 at 18:45
  • @user541 I updated the answer with a basic example for appending complex values. – flyx Mar 25 '18 at 09:51
  • Thanks. I changed a line in the map method to `[ScalarEvent(None, None, (False, True), v, style="|") for v in scalarValues] + \ [MappingEndEvent()]`, but now the subkey is quoted: `my_second_appended_key: "foo": |`. Is there a way to get rid of the quotes with PyYAML? I tried different parameters in the return statement of the map method but did not get the result I wanted. – user541 Mar 25 '18 at 13:31
  • You should ask a new question. I have difficulties following you and figuring out what you are doing. Basically, if you are unsure how to represent a certain structure as events, read an input YAML file with that structure with pyyaml and do `for event in yaml.parse(file): print(event)` to see how the structure looks as event stream. – flyx Mar 25 '18 at 18:10
  • Thank you very much for your efforts and help! – user541 Mar 25 '18 at 19:40
  • I created a separate question here: https://stackoverflow.com/questions/49490449/unnecessary-quotation-of-subkey-and-iteration-through-primary-key-in-pyyaml-even – user541 Mar 26 '18 at 11:43
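
The dict-based alternative mentioned in the comments above can be sketched as follows. This is a minimal illustration, not the answer's method: `append_at` is a hypothetical helper name, the input is inlined as a string instead of being read from file1.yaml, and `sort_keys=False` assumes PyYAML 5.1 or later on a Python version whose dicts keep insertion order. As noted in the comments, the rendering style is not under your control here, so the values will generally not be written back in literal `|` style.

```python
import yaml

def append_at(data, path, key, value):
    """Insert key/value into the mapping reached by following path.

    Hypothetical helper: path is a list of mapping keys or
    sequence indexes, like the path used with the event approach.
    """
    node = data
    for step in path:
        node = node[step]  # works for dict keys and list indexes alike
    if not isinstance(node, dict):
        raise TypeError("target %r is not a mapping" % (path,))
    node[key] = value
    return data

source = """\
test3:
  service1:
    name1: |
      "somedata"
    name2: |
      "somedata"
"""

data = yaml.safe_load(source)
append_at(data, ["test3", "service1"],
          "my-appended-key", '"my appended value"\n')
append_at(data, ["test3"],
          "my_second_appended_key", '"my second appended value"\n')

# sort_keys=False keeps the keys in insertion order when dumping.
result = yaml.safe_dump(data, sort_keys=False)
print(result)
```

The trade-off is the one described in the comments: the code is shorter, but scalar styles and other presentation details from the input file are lost on the round trip.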