
Recently I was trying to manage my docker-compose service configuration (namely docker-compose.yml) using ruamel.yaml.

I need to comment out and uncomment a service block when needed. Suppose I have the following file:

version: '2'
services:
    srv1:
        image: alpine
        container_name: srv1
        volumes:
            - some-volume:/some/path
    srv2:
        image: alpine
        container_name: srv2
        volumes_from:
            - some-volume
volumes:
    some-volume:

Is there some workaround to comment out the srv2 block? Like the following output:

version: '2'
services:
    srv1:
        image: alpine
        container_name: srv1
        volumes:
            - some-volume:/some/path
    #srv2:
    #    image: alpine
    #    container_name: srv2
    #    volumes_from:
    #        - some-volume
volumes:
    some-volume:

Moreover, is there a way to uncomment this block? (Suppose I already hold the original srv2 block; I just need a method to delete these commented lines.)

cherrot
  • Related: *[How do you do block comments in YAML?](https://stackoverflow.com/questions/2276572/how-do-you-do-block-comments-in-yaml/60238027#60238027)* – Peter Mortensen Dec 13 '22 at 22:56

2 Answers


If srv2 is a key that is unique across all of the mappings in your YAML, then the "easy" way is to loop over the lines, test whether the stripped version of a line starts with srv2:, note the number of leading spaces, and comment out that line and the following lines until you reach a line that has an equal or smaller number of leading spaces. Apart from being simple and fast, this has the advantage that it can deal with irregular indentation (as in your example: 4 positions before srv1 and 6 before some-volume).
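
As a minimal sketch of that line-based approach (plain Python, not part of ruamel.yaml; the function name comment_out_block and its marker parameter are made up for illustration, and it assumes space indentation and that the key occurs only once in the file):

```python
def comment_out_block(text, key, marker="#"):
    """Comment out the line starting with `key:` and every line nested under it."""
    out = []
    block_indent = None  # indentation of the `key:` line while inside the block
    for line in text.splitlines(keepends=True):
        stripped = line.lstrip(" ")
        indent = len(line) - len(stripped)
        if block_indent is not None:
            # A non-blank line indented no deeper than the key ends the block.
            if stripped.strip() and indent <= block_indent:
                block_indent = None
        elif stripped.startswith(key + ":"):
            block_indent = indent
        if block_indent is not None and stripped.strip():
            # Insert the marker at the key's column so nested indentation is kept.
            line = line[:block_indent] + marker + line[block_indent:]
        out.append(line)
    return "".join(out)


example = """\
services:
    srv1:
        image: alpine
    srv2:
        image: alpine
        volumes_from:
            - some-volume
volumes:
    some-volume:
"""
print(comment_out_block(example, "srv2"), end="")
```

Blank lines inside the block are left untouched, and the text is returned unchanged if the key is not found. Because it never parses the YAML, the sketch cannot tell a real key from a line that merely happens to start with the same text.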

Doing this using ruamel.yaml is possible as well, but less straightforward. You have to know that when round_trip_loading, ruamel.yaml normally attaches a comment to the last structure (mapping/sequence) that has been processed. As a consequence, commenting out srv1 in your example works completely differently from commenting out srv2 (i.e. the first key-value pair, if commented out, differs from all the other key-value pairs).

If you normalize your expected output to an indent of four positions, add a comment before srv1 for analysis purposes, and load the result, you can search for where the comments end up:

from ruamel.yaml.util import load_yaml_guess_indent

yaml_str = """\
version: '2'
services:
    #a
    #b
    srv1:
        image: alpine
        container_name: srv1
        volumes:
          - some-volume:/some/path
    #srv2:
    #    image: alpine
    #    container_name: srv2
    #    volumes_from:
    #      - some-volume
volumes:
    some-volume:
"""

data, indent, block_seq_indent = load_yaml_guess_indent(yaml_str)
print('indent', indent, block_seq_indent)

c0 = data['services'].ca
print('c0:', c0)
c0_0 = c0.comment[1][0]
print('c0_0:', repr(c0_0.value), c0_0.start_mark.column)

c1 = data['services']['srv1']['volumes'].ca
print('c1:', c1)
c1_0 = c1.end[0]
print('c1_0:', repr(c1_0.value), c1_0.start_mark.column)

which prints:

indent 4 2
c0: Comment(comment=[None, [CommentToken(), CommentToken()]],
  items={})
c0_0: '#a\n' 4
c1: Comment(comment=[None, None],
  items={},
  end=[CommentToken(), CommentToken(), CommentToken(), CommentToken(), CommentToken()])
c1_0: '#srv2:\n' 4

So you "only" have to create the first type of comment (c0) if you comment out the first key-value pair, and the other type (c1) if you comment out any other key-value pair. The start mark is a StreamMark() (from ruamel/yaml/error.py), and the only attribute of that instance that matters when creating comments is column.

Fortunately this is slightly easier than shown above: it is not necessary to attach the comments to the "end" of the value of volumes, since attaching them to the end of the value of srv1 has the same effect.

In the following, comment_block expects a list of keys that is the path to the element to be commented out.

import sys
from copy import deepcopy
from ruamel.yaml import round_trip_dump
from ruamel.yaml.util import load_yaml_guess_indent
from ruamel.yaml.error import StreamMark
from ruamel.yaml.tokens import CommentToken


yaml_str = """\
version: '2'
services:
    srv1:
        image: alpine
        container_name: srv1
        volumes:
          - some-volume:/some/path
    srv2:
        image: alpine
        container_name: srv2  # second container
        volumes_from:
          - some-volume
volumes:
    some-volume:
"""


def comment_block(d, key_index_list, ind, bsi):
    parent = d
    for ki in key_index_list[:-1]:
        parent = parent[ki]
    # don't just pop the value for key_index_list[-1] that way you lose comments
    # in the original YAML, instead deepcopy and delete what is not needed
    data = deepcopy(parent)
    keys = list(data.keys())
    found = False
    previous_key = None
    for key in keys:
        if key != key_index_list[-1]:
            if not found:
                previous_key = key
            del data[key]
        else:
            found = True
    # now delete the key and its value
    del parent[key_index_list[-1]]
    if previous_key is None:
        if parent.ca.comment is None:
            parent.ca.comment = [None, []]
        comment_list = parent.ca.comment[1]
    else:
        comment_list = parent[previous_key].ca.end = []
        parent[previous_key].ca.comment = [None, None]
    # startmark can be the same for all lines, only column attribute is used
    start_mark = StreamMark(None, None, None, ind * (len(key_index_list) - 1))
    for line in round_trip_dump(data, indent=ind, block_seq_indent=bsi).splitlines(True):
        comment_list.append(CommentToken('#' + line, start_mark, None))

for srv in ['srv1', 'srv2']:
    data, indent, block_seq_indent = load_yaml_guess_indent(yaml_str)
    comment_block(data, ['services', srv], ind=indent, bsi=block_seq_indent)
    round_trip_dump(data, sys.stdout,
                    indent=indent, block_seq_indent=block_seq_indent,
                    explicit_end=True,
    )

which prints:

version: '2'
services:
    #srv1:
    #    image: alpine
    #    container_name: srv1
    #    volumes:
    #      - some-volume:/some/path
    srv2:
        image: alpine
        container_name: srv2  # second container
        volumes_from:
          - some-volume
volumes:
    some-volume:
...
version: '2'
services:
    srv1:
        image: alpine
        container_name: srv1
        volumes:
          - some-volume:/some/path
    #srv2:
    #    image: alpine
    #    container_name: srv2      # second container
    #    volumes_from:
    #      - some-volume
volumes:
    some-volume:
...

(the explicit_end=True is not necessary, it is used here to get some demarcation between the two YAML dumps automatically).

Removing the comments this way can be done as well. Recursively search the comment attributes (.ca) for a commented out candidate (maybe giving some hints on where to start). Strip the leading # from the comments and concatenate, then round_trip_load. Based on the column of the comments you can determine where to attach the uncommented key-value pair.
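
For comparison, here is a rough sketch of uncommenting with the simple line-based approach from the start of this answer, rather than via the .ca attributes. It is plain Python, not a ruamel.yaml API; the name uncomment_block_lines and the marker parameter are hypothetical, and it assumes the block was commented out with the marker placed at the key's column, as in the output above:

```python
def uncomment_block_lines(text, key, marker="#"):
    """Strip the marker from the run of comment lines that starts with `#key:`."""
    out = []
    marker_col = None  # column of the marker while inside the commented block
    for line in text.splitlines(keepends=True):
        stripped = line.lstrip(" ")
        col = len(line) - len(stripped)
        if marker_col is None:
            if stripped.startswith(marker + key + ":"):
                marker_col = col
        elif not (col == marker_col and stripped.startswith(marker)):
            # First line that does not continue the comment run ends the block.
            marker_col = None
        if marker_col is not None:
            # Drop only the marker, keeping the indentation that followed it.
            line = line[:marker_col] + line[marker_col + len(marker):]
        out.append(line)
    return "".join(out)


commented = """\
services:
    srv1:
        image: alpine
    #srv2:
    #    image: alpine
    #    volumes_from:
    #        - some-volume
volumes:
    some-volume:
"""
print(uncomment_block_lines(commented, "srv2"), end="")
```

One limitation of this sketch: a genuine comment that directly follows the commented-out block at the same column is treated as part of the block and loses its marker too.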

Anthon
  • my sample output is strictly indented by 4 spaces, it's weird why it prints 6 spaces in your browser. – cherrot May 08 '17 at 05:42
  • @cherrot It is not: before `some-volume:` there are 6 positions of indent, with the dash at offset 4 within them (that is the block sequence indent). It all depends on how you count, of course, but a sequence element like `- a` is counted as indented 2 with offset 0. That is, the `s` of `some-volume` is 6 columns farther in than the `v` of `volumes`, and that is counted as an indent of 6 – Anthon May 08 '17 at 06:37
  • @cherrot This is not something I came up with, it is a result of PyYAML having only one "indent" control for mappings and sequences in which the dash doesn't count. I once looked into splitting that into two parameters for ruamel.yaml but ran into multiple problems. Adding `block-sequence-indent` was the best I could do for now. – Anthon May 08 '17 at 07:33
  • I got it. Thanks for your explanation @anthon! – cherrot May 08 '17 at 08:19

I added an uncomment_block function inspired by @Anthon's answer, plus some enhancements to comment_block:

from copy import deepcopy
from ruamel.yaml import round_trip_dump, round_trip_load
from ruamel.yaml.error import StreamMark
from ruamel.yaml.tokens import CommentToken


def comment_block(root, key_hierarchy_list, indent, seq_indent):
    found = False
    comment_key = key_hierarchy_list[-1]
    parent = root
    for ki in key_hierarchy_list[:-1]:
        parent = parent[ki]
    # don't just pop the value for key_hierarchy_list[-1] that way you lose comments
    # in the original YAML, instead deepcopy and delete what is not needed
    block_2b_commented = deepcopy(parent)
    previous_key = None
    for key in parent.keys():
        if key == comment_key:
            found = True
        else:
            if not found:
                previous_key = key
            del block_2b_commented[key]

    # now delete the key and its value, but preserve its preceding comments
    preceding_comments = parent.ca.items.get(comment_key, [None, None, None, None])[1]
    del parent[comment_key]

    if previous_key is None:
        if parent.ca.comment is None:
            parent.ca.comment = [None, []]
        comment_list = parent.ca.comment[1]
    else:
        comment_list = parent[previous_key].ca.end = []
        parent[previous_key].ca.comment = [None, None]

    if preceding_comments is not None:
        comment_list.extend(preceding_comments)

    # startmark can be the same for all lines, only column attribute is used
    start_mark = StreamMark(None, None, None, indent * (len(key_hierarchy_list) - 1))
    skip = True
    for line in round_trip_dump(block_2b_commented, indent=indent, block_seq_indent=seq_indent).splitlines(True):
        if skip:
            if not line.startswith(comment_key + ':'):
                continue
            skip = False
        comment_list.append(CommentToken('#' + line, start_mark, None))

    # report whether the key was found, mirroring uncomment_block
    return found


def uncomment_block(root, key_hierarchy_list, indent, seq_indent):
    '''
    FIXME: comments may be attached to the parent's neighbour
    in a document like the following (the srv2 block is attached to volumes, not to services or srv1).
    version: '2'
    services:
        srv1: foobar
        #srv2:
        #    image: alpine
        #    container_name: srv2
        #    volumes_from:
        #        - some-volume
    volumes:
        some-volume:
    '''
    found = False
    parent = root
    commented_key = key_hierarchy_list[-1]
    comment_indent = indent * (len(key_hierarchy_list) - 1)
    for ki in key_hierarchy_list[:-1]:
        parent = parent[ki]

    if parent.ca.comment is not None:
        comment_list = parent.ca.comment[1]
        found, start, stop = _locate_comment_boundary(comment_list, commented_key, comment_indent)

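    if not found:
        for key in parent.keys():
            bro = parent[key]
            # descend to the last nested value, where trailing comments get attached
            while hasattr(bro, 'keys') and bro.keys():
                # list() keeps this working on Python 3, where keys() returns a view
                bro = bro[list(bro.keys())[-1]]

            if not hasattr(bro, 'ca'):
                continue

            comment_list = bro.ca.end
            found, start, stop = _locate_comment_boundary(comment_list, commented_key, comment_indent)
            if found:
                # stop here, otherwise a later sibling would overwrite the match
                break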

    if found:
        block_str = u''
        commented = comment_list[start:stop]
        for ctoken in commented:
            block_str += ctoken.value.replace('#', '', 1)
        del(comment_list[start:stop])

        block = round_trip_load(block_str)
        parent.update(block)
    return found


def _locate_comment_boundary(comment_list, commented_key, comment_indent):
    found = False
    start_idx = 0
    stop_idx = len(comment_list)
    for idx, ctoken in enumerate(comment_list):
        if not found:
            if ctoken.start_mark.column == comment_indent\
                    and ctoken.value.replace('#', '', 1).startswith(commented_key):
                found = True
                start_idx = idx
        elif ctoken.start_mark.column != comment_indent:
            stop_idx = idx
            break
    return found, start_idx, stop_idx


if __name__ == "__main__":
    import sys
    from ruamel.yaml.util import load_yaml_guess_indent

    yaml_str = """\
version: '2'
services:
    # 1 indent after services
    srv1:
        image: alpine
        container_name: srv1
        volumes:
          - some-volume
        # some comments
    srv2:
        image: alpine
        container_name: srv2  # second container
        volumes_from:
          - some-volume
        # 2 indent after srv2 volume
# 0 indent before volumes
volumes:
    some-volume:
"""

    for srv in ['srv1', 'srv2']:
        # Comment a service block
        yml, indent, block_seq_indent = load_yaml_guess_indent(yaml_str)
        comment_block(yml, ['services', srv], indent=indent, seq_indent=block_seq_indent)
        commented = round_trip_dump(
            yml, indent=indent, block_seq_indent=block_seq_indent, explicit_end=True,
        )
        print(commented)

        # Now uncomment it
        yml, indent, block_seq_indent = load_yaml_guess_indent(commented)
        uncomment_block(yml, ['services', srv], indent=indent, seq_indent=block_seq_indent)

        round_trip_dump(
            yml, sys.stdout, indent=indent, block_seq_indent=block_seq_indent, explicit_end=True,
        )

Output:

version: '2'
services:
    # 1 indent after services
    #srv1:
    #    image: alpine
    #    container_name: srv1
    #    volumes:
    #      - some-volume
    #        # some comments
    srv2:
        image: alpine
        container_name: srv2  # second container
        volumes_from:
          - some-volume
        # 2 indent after srv2 volume
# 0 indent before volumes
volumes:
    some-volume:
...

version: '2'
services:
    # 1 indent after services
    srv2:
        image: alpine
        container_name: srv2  # second container
        volumes_from:
          - some-volume
        # 2 indent after srv2 volume
# 0 indent before volumes
    srv1:
        image: alpine
        container_name: srv1
        volumes:
          - some-volume
        # some comments
volumes:
    some-volume:
...
version: '2'
services:
    # 1 indent after services
    srv1:
        image: alpine
        container_name: srv1
        volumes:
          - some-volume
        # some comments
    #srv2:
    #    image: alpine
    #    container_name: srv2      # second container
    #    volumes_from:
    #      - some-volume
    #        # 2 indent after srv2 volume
    ## 0 indent before volumes
volumes:
    some-volume:
...

version: '2'
services:
    # 1 indent after services
    srv1:
        image: alpine
        container_name: srv1
        volumes:
          - some-volume
        # some comments
    srv2:
        image: alpine
        container_name: srv2  # second container
        volumes_from:
          - some-volume
        # 2 indent after srv2 volume
# 0 indent before volumes
volumes:
    some-volume:
...
cherrot