
The Python yaml package (PyYAML, version 5.1.2) loads the following file correctly, even though the lists are written in bracket form rather than with a leading - per item:

xx: [x1, x2]
yy: [y1, y2, y3]

The loading code is as follows:

import yaml

with open('some file') as f:
    data = yaml.load(f, Loader=yaml.FullLoader)

This format is used in github actions config yaml files. For example,

on: [push, pull_request]

jobs:
  build:

    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [2.7, 3.5, 3.6, 3.7, 3.8]
        os: [ubuntu-16.04, ubuntu-18.04]
        node: [6, 8, 10]

But when I write the data back to a file using yaml.dump(data, f), it uses the leading - convention, i.e.,

xx:
- x1
- x2
yy:
- y1
- y2
- y3

Is there a way to force it into the github-actions-like format?

I was told about default_flow_style, but setting it to True doesn't give exactly what I want, because the top-level mapping ends up in braces too:

yaml.dump({"A":[1,2,3],"B":[4,5,6]},default_flow_style=True)

The output is '{A: [1, 2, 3], B: [4, 5, 6]}\n'

nos
  • Why don't you use JSON if you want a JSON-like format? – deadshot May 24 '20 at 02:53
  • I don't want to use JSON. The format I posted is used in github actions config yaml files. I thought it's a common convention. Notice it's not the same as JSON. – nos May 24 '20 at 03:43
  • By default (or with `default_flow_style=None`), dumping to a YAML file should produce `xx: [x1, x2]`. This is explicitly stated in the [PyYAML docs](https://pyyaml.org/wiki/PyYAMLDocumentation): "By default, PyYAML chooses the style of a collection depending on whether it has nested collections. If a collection has nested collections, it will be assigned the block style. Otherwise it will have the flow style." It seems there is no way to control the flow style other than `default_flow_style`. Anyway, if the file is treated as YAML, it should be read correctly in both "flow" and "block" styles. – Tsyvarev May 24 '20 at 08:45
  • Please make your comment into an answer, then I will accept it. Thanks! – nos May 24 '20 at 14:23

2 Answers


As pointed out by @Tsyvarev, my desired behavior can be triggered by passing default_flow_style=None:

yaml.dump({"A":[1,2,3],"B":[4,5,6]}, default_flow_style=None)

The official documentation doesn't seem to define this None behavior though:

By default, PyYAML chooses the style of a collection depending on whether it has nested collections. If a collection has nested collections, it will be assigned the block style. Otherwise it will have the flow style.

If you want collections to be always serialized in the block style, set the parameter default_flow_style of dump() to False.
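To make the difference concrete, here is a minimal sketch comparing the three values of default_flow_style on the same data. As far as I can tell, PyYAML 5.1 changed the default from None to False, which would explain why a plain yaml.dump(data, f) produced block style for me; treat that version detail as an assumption and check it against your installed release.

```python
import yaml

data = {"A": [1, 2, 3], "B": [4, 5, 6]}

# None: PyYAML picks the style per collection. Collections that contain
# only plain scalars are written in flow style, the rest in block style.
mixed = yaml.dump(data, default_flow_style=None)

# False: every collection is written in block style.
block = yaml.dump(data, default_flow_style=False)

# True: every collection, including the top-level mapping, is in flow style.
flow = yaml.dump(data, default_flow_style=True)

print(mixed)  # A: [1, 2, 3]
              # B: [4, 5, 6]
print(block)  # A:, then "- 1", "- 2", "- 3", and likewise for B
print(flow)   # {A: [1, 2, 3], B: [4, 5, 6]}
```

Only the None variant gives the github-actions-like layout, with a block mapping at the top and flow lists inside it.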

nos

Generally, it is not possible to write out YAML exactly the way it was written when you loaded it; see this question.

You can follow the advice in the answer there: load to a node graph instead of native objects. It looks like this in PyYAML:

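import io

import yaml

# Named yaml_text rather than input to avoid shadowing the built-in input().
yaml_text = """
xx: [x1, x2]
yy: [y1, y2, y3]
"""

# Compose a node graph instead of constructing native Python objects.
loader = yaml.Loader(yaml_text)
try:
    node = loader.get_single_node()
finally:
    loader.dispose()

# Serialize the node graph; each node still carries its flow/block style.
stream = io.StringIO()
dumper = yaml.Dumper(stream)
try:
    dumper.open()
    dumper.serialize(node)
    dumper.close()
finally:
    dumper.dispose()
print(stream.getvalue())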

Output will be:

xx: [x1, x2]
yy: [y1, y2, y3]

This works because a node still remembers its original style (while the native data doesn't). It is still possible to alter the YAML structure, but you now need to create data as nodes instead of just manipulating the loaded Python data.

If you want to create your data in Python and dump in your preferred format, the easiest way to do that would probably be:

  • create the data
  • dump it to a YAML string
  • load that string as node graph
  • walk the node graph and alter the style attribute of the nodes to your liking
  • represent the node graph as YAML again
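The steps above could be sketched roughly as follows. The helper name dump_with_flow_lists and the rule it applies (flow style for any sequence whose items are all scalars) are my own choices for illustration, not something PyYAML prescribes; adapt the walk to whatever styling you need.

```python
import yaml


def dump_with_flow_lists(data):
    """Dump data with scalar-only lists in flow style (hypothetical helper)."""
    # Steps 1-2: the data already exists; dump it to a YAML string.
    text = yaml.safe_dump(data, default_flow_style=False)

    # Step 3: load that string as a node graph.
    root = yaml.compose(text, Loader=yaml.SafeLoader)

    # Step 4: walk the graph and adjust the style of selected nodes.
    def walk(node):
        if isinstance(node, yaml.SequenceNode):
            if all(isinstance(item, yaml.ScalarNode) for item in node.value):
                node.flow_style = True
            for item in node.value:
                walk(item)
        elif isinstance(node, yaml.MappingNode):
            for key_node, value_node in node.value:
                walk(key_node)
                walk(value_node)

    walk(root)

    # Step 5: represent the node graph as YAML again.
    return yaml.serialize(root, Dumper=yaml.SafeDumper)


result = dump_with_flow_lists({"xx": ["x1", "x2"], "yy": ["y1", "y2", "y3"]})
print(result)
```

For this particular rule, default_flow_style=None gets you the same result with less work; the node walk is only worth it when you need per-node control, for example flow style for some lists and block style for others.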
flyx