
I have a YAML file and I am trying to convert it into a Jinja2 template.

The problem I face is that extra quotes appear when I dump the YAML file.

Sample YAML content:

- XYZ: 'Hispanic'
  age: 43
  hobbies: ['cycling', 'skating']

- ABC: 'American Indian'
  age: 43
  hobbies: ['ice hockey']

I plan to convert it to a jinja2 template such as:

- {{ name }}: "{% include 'ethnicity.jinja2' with context %}"
  age: 43
  hobbies: "{% include 'hobbies.jinja2' with context %}"

where hobbies.jinja2 will be:

{% if name == 'XYZ' %}['cycling', 'skating']{% endif %}
{% if name == 'ABC' %}['rafting', 'ice hockey']{% endif %}

and ethnicity.jinja2 will be:

{% if name == 'ABC' %}Hispanic{% endif %}
{% if name == 'XYZ' %}American Indian{% endif %}

While dumping the YAML contents, I get:

- '{{ name }}': "{% include ''ethnicity.jinja2'' with context %}"
  age: 43
  hobbies: "{% include 'hobbies.jinja2' with context %}"

How can I remove the unwanted quotes around the non-alphabetic string {{ name }}, and the doubled single quotes around ethnicity.jinja2?

In my use-case:

  1. I will have to include a template from within a template.
  2. I cannot change the input yaml.

Another issue I face in doing so is that hobbies becomes a string instead of a CommentedSeq,

i.e. hobbies: "['cycling', 'skating']"

Anthon
iDev
  • If you don't want quotes around the sequence, you should not include quotes in your template: `hobbies: {% include 'hobbies.jinja2' with context %}` – Anthon Oct 06 '21 at 05:32

2 Answers


TL;DR: The YAML - {{ name }} cannot be represented with plain Python dictionaries, because it uses a mapping as a key of another mapping. It is a compact way to write - { { name: null } : null }. See @Anthon's answer for how to handle this.

At first glance, using ruamel.yaml (not the derivative ruamel_yaml package) to generate the following output

- {{ name }}: "{% include 'ethnicity.jinja2' with context %}"

does not seem possible, because it specifies a mapping as a key to another mapping. In Python that would require a dictionary to be the key of another dictionary, which is not allowed because dictionaries are not hashable.

Given a file named template.yml that contains

- {{ name }}: "{% include 'ethnicity.jinja2' with context %}"
  age: 43
  hobbies: "{% include 'hobbies.jinja2' with context %}"

I tried creating a round trip like so

import io
from ruamel.yaml import YAML
import ruamel.yaml

def round_trip(path, yaml_type="safe"):
    yaml=YAML(typ=yaml_type)
    with open(path) as doc:
        yaml_obj = yaml.load(doc)

    yaml.default_flow_style = False
    yaml_fd = io.StringIO()
    yaml.dump(yaml_obj, yaml_fd)
    return yaml_fd.getvalue(), yaml_obj


## YAML types from https://sourceforge.net/p/ruamel-yaml/code/ci/default/tree/main.py#l53
# 'rt'/None -> RoundTripLoader/RoundTripDumper,  (default)
# 'safe'    -> SafeLoader/SafeDumper,
# 'unsafe'  -> normal/unsafe Loader/Dumper
# 'base'    -> baseloader
TYPES = ["safe", "rt", "base", "unsafe"]

PATH = "template.yml"

for t in TYPES:
    print(t, "==========\n", sep="\n")
    try:
        out, obj = round_trip(PATH, t)
    except ruamel.yaml.constructor.ConstructorError as err:
        print(str(err), end="\n\n")
        continue
    print(obj, end="\n\n")
    print(type(obj), end="\n\n")

which generated this output

safe
==========

while constructing a mapping
  in "template.yml", line 1, column 3
found unhashable key
  in "template.yml", line 1, column 4

rt
==========

[ordereddict([(ordereddict([(ordereddict([('name', None)]), None)]), "{% include 'ethnicity.jinja2' with context %}"), ('age', 43), ('hobbies', "{% include 'hobbies.jinja2' with context %}")])]

<class 'ruamel.yaml.comments.CommentedSeq'>

base
==========

while constructing a mapping
  in "template.yml", line 1, column 3
found unhashable key
  in "template.yml", line 1, column 4

unsafe
==========

while constructing a mapping
  in "template.yml", line 1, column 3
found unhashable key
  in "template.yml", line 1, column 4

Hence, only YAML(typ="rt") parsed the input without raising an error.

I tried recreating the output from scratch with

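>>> from ruamel.yaml import YAML
>>> from ruamel.yaml.comments import CommentedKeyMap
>>> from ruamel.yaml.compat import ordereddict
>>> yaml = YAML()  # assumed: a default round-trip instance
>>> fd = io.StringIO()
>>> yaml.dump(ordereddict([(CommentedKeyMap(ordereddict([(CommentedKeyMap(ordereddict(name=None)), None)])), "value")]), fd)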
>>> print(fd.getvalue())
!!omap
- {{name: null}: null}: value

which showed that ruamel.yaml interprets {{ name }} as a flow mapping whose key is itself a mapping, with null as the implicit value. This is allowed in the YAML spec, as a mapping can be a key for another mapping. However, it is not allowed in plain Python, because a dictionary is not hashable and therefore cannot be a key in another dictionary. In other words, this is invalid:

# Raises TypeError: unhashable type: 'dict'
map_of_maps = {{"name": "val1"}: "val2"}

ruamel.yaml gets around this by wrapping the mapping in a CommentedKeyMap class, which is hashable.
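As a minimal sketch of why a hashable wrapper helps, an immutable mapping that defines __hash__ can serve as a dictionary key. The class name HashableMap and the helper function are hypothetical, and this is not a description of how ruamel.yaml implements CommentedKeyMap:

```python
from collections.abc import Mapping


class HashableMap(Mapping):
    """Hypothetical read-only mapping that can be used as a dict key."""

    def __init__(self, *args, **kwargs):
        self._data = dict(*args, **kwargs)

    def __getitem__(self, key):
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

    def __hash__(self):
        # Hash on the (key, value) pairs; the values must be hashable too.
        return hash(frozenset(self._data.items()))


def plain_dict_as_key_fails():
    """Return True if using a plain dict as a key raises TypeError."""
    try:
        {{"name": None}: "value"}
    except TypeError:
        return True
    return False


# Mirrors the structure {{name: null}: null}: value from the dump above.
inner = HashableMap(name=None)
outer = {HashableMap({inner: None}): "value"}
```

Because the wrapper never changes after construction, its hash stays stable, which is the property a dictionary key needs.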

I also tried parsing your expected YAML file with the Online YAML Parser, which gave an error, but that tool only supports YAML 1.1, as pointed out by @Anthon in the comments.

ogdenkev
  • The package is `ruamel.yaml`; `ruamel_yaml` is a derivative made from an old version for an installer that cannot handle namespaces. Although `ruamel.yaml` cannot generate the output directly, it has a pre- and post-processor for jinja2 templates of YAML documents that can. – Anthon Oct 06 '21 at 05:26
  • The online YAML parser you reference only supports YAML 1.1, which was replaced by YAML 1.2 more than a decade ago. – Anthon Oct 06 '21 at 05:27

If you are not sure whether you can generate the output that you want with ruamel.yaml, or don't know how to approach that, it is always good to try to round-trip your YAML.

Of course, a jinja2 template cannot be loaded without some pre-processing, as it isn't valid YAML. You can get that pre-processing by installing ruamel.yaml.jinja2 and instantiating YAML(typ="jinja2"), as described in the linked answer:

import sys
import ruamel.yaml

yaml_str = """\
- {{ name }}: "{% include 'ethnicity.jinja2' with context %}"
  age: 43
  hobbies: {% include 'hobbies.jinja2' with context %}
"""

yaml = ruamel.yaml.YAML(typ='jinja2')
yaml.preserve_quotes = True
data = yaml.load(yaml_str)
data[0]['age'] = 18
yaml.dump(data, sys.stdout)

which gives:

- {{ name }}: "{% include 'ethnicity.jinja2' with context %}"
  age: 18
  hobbies: {% include 'hobbies.jinja2' with context %}

The output equals the input, except for the expected change in the value for age. So ruamel.yaml, when used with the ruamel.yaml.jinja2 plugin, can handle this.

You should investigate what the data structure looks like, and you could try to generate that from scratch. There is a caveat, though: the YAML() instance keeps some information about how the invalid, jinja2-infused YAML document was transformed into a valid YAML document, and that information is used during dumping.

In general, dumping the data with a different YAML() instance will not work when handling jinja2 templates for YAML. Adding the following:

yaml2 = ruamel.yaml.YAML(typ='jinja2')
yaml2.dump(data, sys.stdout)

will generate an AttributeError because that extra information is missing.

You can investigate the missing attribute, but that is internal information that can change without notice in a future version of ruamel.yaml.

If applicable, the easiest way forward is to generate a minimal jinja2 template. You can use Python's % operator to avoid having to escape {, as you would have to do with .format() or f-strings, but then you have to escape jinja2's %. If you go with .format():

yaml_str = """\
- {{{{ {var00} }}}}: "{{% include '{var01}' with context %}}"
  hobbies: {{% include '{var02}' with context %}}
"""

data = yaml.load(yaml_str.format(var00='name', var01='ethnicity.jinja2', var02='hobbies.jinja2'))
data[0].insert(1, 'age', 43)
yaml.dump(data, sys.stdout)

giving output:

- {{ name }}: "{% include 'ethnicity.jinja2' with context %}"
  age: 43
  hobbies: {% include 'hobbies.jinja2' with context %}
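For comparison, the % operator variant mentioned earlier might look like the sketch below. It only builds the template text and does not load it; the names var00 to var02 mirror the .format() example, and the variable names percent_template and format_template are made up for illustration:

```python
# Values to substitute; the keys mirror the .format() example.
values = {"var00": "name", "var01": "ethnicity.jinja2", "var02": "hobbies.jinja2"}

# With the % operator, braces are literal, but every jinja2 % must be doubled.
percent_template = """\
- {{ %(var00)s }}: "{%% include '%(var01)s' with context %%}"
  hobbies: {%% include '%(var02)s' with context %%}
"""

# With .format(), % is literal, but every brace must be doubled.
format_template = """\
- {{{{ {var00} }}}}: "{{% include '{var01}' with context %}}"
  hobbies: {{% include '{var02}' with context %}}
"""

expanded_percent = percent_template % values
expanded_format = format_template.format(**values)
print(expanded_percent)
```

Both expansions produce the same text, so the choice comes down to which character, { or %, is less painful to escape in a given template.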

The above does expansion before loading. You can also recursively walk over the loaded data and do the expansion after loading, on keys, values and elements that are strings. Some constructs (like jinja2's {% if %}) are converted to comments, so these will not be as easy to change. Because jinja2's use of the characters {, } and % interferes with both Python's % operator and .format(), you should probably tag the values to be substituted with some seldom-used special (Unicode) code point instead, and use a dedicated replacement routine.
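A minimal sketch of that idea follows, assuming placeholders are wrapped in U+2042 (an arbitrary choice of seldom-used code point) and that the data consists of plain dicts and lists. The names MARK, PLACEHOLDER and expand are hypothetical. The function builds new containers, so round-trip data loaded by ruamel.yaml would more likely need to be updated in place to keep the extra information attached to it:

```python
import re

MARK = "\u2042"  # ASTERISM; an arbitrary, seldom-used code point (assumption)
PLACEHOLDER = re.compile(f"{MARK}(\\w+){MARK}")


def expand(node, values):
    """Return a copy of node with marked placeholders replaced in all strings."""
    if isinstance(node, str):
        return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), node)
    if isinstance(node, dict):
        # Expand both keys and values.
        return {expand(k, values): expand(v, values) for k, v in node.items()}
    if isinstance(node, list):
        return [expand(item, values) for item in node]
    return node  # numbers, None, etc. are left untouched


data = [
    {
        "{{ \u2042var00\u2042 }}": "{% include '\u2042var01\u2042' with context %}",
        "age": 43,
    }
]
expanded = expand(data, {"var00": "name", "var01": "ethnicity.jinja2"})
print(expanded)
```

Since the marker does not occur in jinja2 or Python formatting syntax, neither braces nor percent signs need escaping in the tagged strings.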

Anthon