102

I'm using yaml.dump to output a dict. It prints out each item in alphabetical order based on the key.

>>> d = {"z":0,"y":0,"x":0}
>>> yaml.dump( d, default_flow_style=False )
'x: 0\ny: 0\nz: 0\n'

Is there a way to control the order of the key/value pairs?

In my particular use case, printing in reverse would (coincidentally) be good enough. For completeness though, I'm looking for an answer that shows how to control the order more precisely.

I've looked at using collections.OrderedDict but PyYAML doesn't (seem to) support it. I've also looked at subclassing yaml.Dumper, but I haven't been able to figure out if it has the ability to change item order.

dreftymac
mwcz

9 Answers

231

If you upgrade PyYAML to version 5.1 or later, it supports dumping without sorting the keys, like this:

yaml.dump(data, sort_keys=False)

As shown in help(yaml.Dumper), sort_keys defaults to True:

Dumper(stream, default_style=None, default_flow_style=False,
  canonical=None, indent=None, width=None, allow_unicode=None,
  line_break=None, encoding=None, explicit_start=None, explicit_end=None,
  version=None, tags=None, sort_keys=True)

(These are passed as kwargs to yaml.dump)
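To control the order more precisely than "as inserted", one option is to rebuild the dict in the order you want before dumping. The sketch below assumes PyYAML 5.1+ and Python 3.7+ (where plain dicts keep insertion order); `key_order` is a hypothetical name used only for illustration.

```python
import yaml

d = {"z": 0, "y": 0, "x": 0}

# With sorting disabled, keys are emitted in insertion order.
as_inserted = yaml.dump(d, default_flow_style=False, sort_keys=False)

# For an arbitrary order, rebuild the dict in that order first.
key_order = ["y", "z", "x"]  # hypothetical ordering, chosen for illustration
custom = yaml.dump(
    {key: d[key] for key in key_order},
    default_flow_style=False,
    sort_keys=False,
)

print(as_inserted)
print(custom)
```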

Louis Maddox
Cooper.Wu
49

There's probably a better workaround, but I couldn't find anything in the documentation or the source.


Python 2 (see comments)

I subclassed OrderedDict and made its items() method return a list whose sort() method does nothing:

import yaml
from collections import OrderedDict

class UnsortableList(list):
    def sort(self, *args, **kwargs):
        pass

class UnsortableOrderedDict(OrderedDict):
    def items(self, *args, **kwargs):
        return UnsortableList(OrderedDict.items(self, *args, **kwargs))

yaml.add_representer(UnsortableOrderedDict, yaml.representer.SafeRepresenter.represent_dict)

And it seems to work:

>>> d = UnsortableOrderedDict([
...     ('z', 0),
...     ('y', 0),
...     ('x', 0)
... ])
>>> yaml.dump(d, default_flow_style=False)
'z: 0\ny: 0\nx: 0\n'

Python 3 or 2 (see comments)

You can also write a custom representer, but I don't know if you'll run into problems later on, as I stripped out some style checking code from it:

import yaml

from collections import OrderedDict

def represent_ordereddict(dumper, data):
    value = []

    for item_key, item_value in data.items():
        node_key = dumper.represent_data(item_key)
        node_value = dumper.represent_data(item_value)

        value.append((node_key, node_value))

    return yaml.nodes.MappingNode(u'tag:yaml.org,2002:map', value)

yaml.add_representer(OrderedDict, represent_ordereddict)

But with that, you can use the native OrderedDict class.
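As a rough usage sketch of the representer above: here it is registered on a Dumper subclass rather than globally, which is my own variation, and `OrderedDumper` is a hypothetical name. The round trip through `yaml.safe_load` only checks key order and assumes Python 3.7+.

```python
import yaml
from collections import OrderedDict

def represent_ordereddict(dumper, data):
    # Build (key node, value node) pairs in the dict's own iteration order.
    pairs = [
        (dumper.represent_data(key), dumper.represent_data(value))
        for key, value in data.items()
    ]
    return yaml.nodes.MappingNode('tag:yaml.org,2002:map', pairs)

class OrderedDumper(yaml.Dumper):
    """Hypothetical Dumper subclass so the global yaml state is untouched."""

OrderedDumper.add_representer(OrderedDict, represent_ordereddict)

d = OrderedDict([("z", 0), ("y", 0), ("x", 0)])
text = yaml.dump(d, Dumper=OrderedDumper)

# Loading the text back shows the keys in the order they were dumped.
loaded_keys = list(yaml.safe_load(text))
print(text)
```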

EquipDev
Blender
  • Very nice, I like your style. I'll go with the first solution because I think it's a little more clear. I'll have to rebuild the dict either way, and the `MappingNode` call and strange unicode string in the representer make it kind of opaque (to me!). Thanks! – mwcz May 28 '13 at 01:19
  • @mwcz: The only problem with the first one is subclassing `OrderedDict`, so if it works, it works. – Blender May 28 '13 at 01:23
  • 2
    I'm not sure if it's my version of Python (3.4), but this isn't working. I looked in the source at `yaml/representer.py:111`, and you can see `mapping = sorted(mapping)`. It is using the `sorted` builtin, not the `.sort()` method of UnsortableList. Any ideas? – Hayk Martiros Nov 14 '14 at 06:31
  • Looking at the `PyYAML` source, it turns out that `dumper.represent_mapping` would do this if a single line was removed. See my answer for details. I think it would be worth submitting a request to have this as an option. – orodbhen Aug 31 '17 at 15:13
27

In Python 3.7+, dicts preserve insertion order. Since PyYAML 5.1, you can disable the sorting of keys (#254). Unfortunately, sort_keys still defaults to True.

>>> import yaml
>>> yaml.dump({"b":1, "a": 2})
'a: 2\nb: 1\n'
>>> yaml.dump({"b":1, "a": 2}, sort_keys=False)
'b: 1\na: 2\n'

My project oyaml is a monkeypatch/drop-in replacement for PyYAML. It will preserve dict order by default in all Python versions and PyYAML versions.

>>> import oyaml as yaml  # pip install oyaml
>>> yaml.dump({"b":1, "a": 2})
'b: 1\na: 2\n'

Additionally, it will dump collections.OrderedDict (a dict subclass) as normal mappings, rather than as Python objects.

>>> from collections import OrderedDict
>>> d = OrderedDict([("b", 1), ("a", 2)])
>>> import yaml
>>> yaml.dump(d)
'!!python/object/apply:collections.OrderedDict\n- - - b\n    - 1\n  - - a\n    - 2\n'
>>> yaml.safe_dump(d)
RepresenterError: ('cannot represent an object', OrderedDict([('b', 1), ('a', 2)]))
>>> import oyaml as yaml
>>> yaml.dump(d)
'b: 1\na: 2\n'
>>> yaml.safe_dump(d)
'b: 1\na: 2\n'
wim
14

One-liner to rule them all:

yaml.add_representer(dict, lambda self, data: yaml.representer.SafeRepresenter.represent_dict(self, data.items()))

That's it. Finally. After all those years and hours, the mighty represent_dict has been defeated by giving it dict.items() instead of the dict itself.

Here is how it works:

This is the relevant PyYAML source code:

    if hasattr(mapping, 'items'):
        mapping = list(mapping.items())
        try:
            mapping = sorted(mapping)
        except TypeError:
            pass
    for item_key, item_value in mapping:

To prevent the sorting we just need some Iterable[Pair] object that does not have .items().

dict_items is a perfect candidate for this.
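A quick check of that claim, using only the standard library: the dict itself has an `.items` attribute, while the view returned by `.items()` does not, so the `hasattr(mapping, 'items')` branch shown above (and the sorting inside it) is skipped. The variable names here are purely illustrative.

```python
d = {"z": 0, "y": 0, "x": 0}
pairs = d.items()  # a dict_items view

# The dict would enter the sorting branch; the view would not.
dict_has_items = hasattr(d, "items")
view_has_items = hasattr(pairs, "items")

# Iterating the view still yields (key, value) pairs in insertion order
# (guaranteed for plain dicts from Python 3.7 on).
ordered_pairs = list(pairs)

print(dict_has_items, view_has_items, ordered_pairs)
```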

Here is how to do this without affecting the global state of the yaml module:

import yaml

# Use a custom Dumper class so the global state of the yaml module is not changed.
class CustomDumper(yaml.Dumper):
    # Hack to preserve the mapping key order. See https://stackoverflow.com/a/52621703/1497385
    def represent_dict_preserve_order(self, data):
        # dict_items has no .items(), so represent_dict does not sort it.
        return self.represent_dict(data.items())

CustomDumper.add_representer(dict, CustomDumper.represent_dict_preserve_order)

component_dict = {"z": 0, "y": 0, "x": 0}  # example data
print(yaml.dump(component_dict, Dumper=CustomDumper))
Ark-kun
  • 2
    The approach of adding a representer for `dict` won't work reliably on versions of Python prior to 3.7. See [this Q](https://stackoverflow.com/questions/39980323/are-dictionaries-ordered-in-python-3-6) and its answers. I was looking at your answer and puzzled over the fact that the output was ordered in key insertion order *despite* using `dict` rather than `OrderedDict`. Fortunately, the method used here can be easily adapted to `OrderedDict` for those who need it: add a representer for `OrderedDict` instead of `dict`, with the same implementation and it works. – Louis Jan 25 '19 at 15:00
3

This is really just an addendum to @Blender's answer. If you look in the PyYAML source, in the representer.py module, you find this method:

def represent_mapping(self, tag, mapping, flow_style=None):
    value = []
    node = MappingNode(tag, value, flow_style=flow_style)
    if self.alias_key is not None:
        self.represented_objects[self.alias_key] = node
    best_style = True
    if hasattr(mapping, 'items'):
        mapping = mapping.items()
        mapping.sort()
    for item_key, item_value in mapping:
        node_key = self.represent_data(item_key)
        node_value = self.represent_data(item_value)
        if not (isinstance(node_key, ScalarNode) and not node_key.style):
            best_style = False
        if not (isinstance(node_value, ScalarNode) and not node_value.style):
            best_style = False
        value.append((node_key, node_value))
    if flow_style is None:
        if self.default_flow_style is not None:
            node.flow_style = self.default_flow_style
        else:
            node.flow_style = best_style
    return node

If you simply remove the mapping.sort() line, then it maintains the order of items in the OrderedDict.

Another solution is given in this post. It's similar to @Blender's, but works for safe_dump. What the two have in common is that they convert the dict to a list of tuples, so the hasattr(mapping, 'items') check evaluates to false and the sort is skipped.

Update:

I just noticed that the Fedora Project's EPEL repo has a package called python2-yamlordereddictloader, and there's one for Python 3 as well. The upstream project for that package is likely cross-platform.

orodbhen
2

There are two things you need to do to get the output you want:

  • you need to use something other than a dict, because it doesn't keep the items ordered
  • you need to dump that alternative in the appropriate way.¹

import sys
import ruamel.yaml
from ruamel.yaml.comments import CommentedMap

yaml = ruamel.yaml.YAML()    

d = CommentedMap()
d['z'] = 0
d['y'] = 0
d['x'] = 0

yaml.dump(d, sys.stdout)

output:

z: 0
y: 0
x: 0

¹ This was done using ruamel.yaml, a YAML 1.2 parser, of which I am the author.

Anthon
  • Python >= 3.7 now maintains insertion order, but ruamel.yaml seems to ignore the order... – pepoluan May 17 '23 at 05:00
  • I just copy-pasted the program above and it gives the correct output. I'll update the answer to the new API, but that will not make a difference. If you experience different behaviour please post a question. – Anthon May 17 '23 at 08:30
  • Hmm, I can only get the desired behavior (using new API) by setting `yaml.Representer` to `RoundTripRepresenter`. Eh, problem solved for me :-) – pepoluan May 17 '23 at 09:37
0

If safe_dump (i.e. dump with Dumper=SafeDumper) is used, then calling yaml.add_representer has no effect. In that case it is necessary to call the add_representer method explicitly on the SafeRepresenter class:

yaml.representer.SafeRepresenter.add_representer(
    OrderedDict, ordered_dict_representer
)
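The snippet above does not define `ordered_dict_representer`, so the following is only a sketch of what such a function might look like, assuming Python 3. To avoid touching global state, it registers the representer on a hypothetical `OrderedSafeDumper` subclass of `yaml.SafeDumper` instead of on `SafeRepresenter` itself.

```python
import yaml
from collections import OrderedDict

def ordered_dict_representer(dumper, data):
    # Hypothetical implementation: pass the items view, which has no
    # .items() attribute, so represent_mapping does not sort the pairs.
    return dumper.represent_mapping('tag:yaml.org,2002:map', data.items())

class OrderedSafeDumper(yaml.SafeDumper):
    """Hypothetical subclass; behaves like SafeDumper plus OrderedDict support."""

OrderedSafeDumper.add_representer(OrderedDict, ordered_dict_representer)

d = OrderedDict([("z", 0), ("y", 0), ("x", 0)])
text = yaml.dump(d, Dumper=OrderedSafeDumper, default_flow_style=False)
print(text)
```

Passing `Dumper=OrderedSafeDumper` to `yaml.dump` is used here because `yaml.safe_dump` always selects `SafeDumper` itself.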
Peter Bašista
-1

I was also looking for an answer to the question "how to dump mappings with the order preserved?" I couldn't follow the solutions given above, as I am new to PyYAML and Python. After spending some time on the PyYAML documentation and other forums, I found this.

You can use the tag

!!omap

to dump the mappings while preserving the order. If you want to control the order, I think you have to list the entries as key: value pairs.

The links below can help for better understanding.

https://bitbucket.org/xi/pyyaml/issue/13/loading-and-then-dumping-an-omap-is-broken

http://yaml.org/type/omap.html
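As a small illustration of what an !!omap document looks like on the loading side: as far as I can tell, PyYAML's safe loader turns it into a list of (key, value) tuples, so the order of the entries is kept. The `document` text below is made up for this example, and this sketch does not show dumping.

```python
import yaml

# An ordered map is written as a sequence of single-pair mappings.
document = """\
--- !!omap
- z: 0
- y: 0
- x: 0
"""

pairs = yaml.safe_load(document)
print(pairs)
```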

  • 7
    Could you add a few lines of code as an example? Although this isn't quite a link-only answer, it won't leave a lot to go on if the links break, and example code is more convenient for us lazy people. For more, please refer to this question about StackOverflow style: http://meta.stackexchange.com/questions/8231/are-answers-that-just-contain-links-elsewhere-really-good-answers (this link is unlikely to rot ) – Michael Scheper Feb 23 '17 at 16:55
-1

The following setting makes sure the content is not sorted in the output:

yaml.sort_base_mapping_type_on_output = False
Wai Ha Lee