14

How do you control the order in which PyYAML outputs key/value pairs when serializing a Python dictionary?

I'm using Yaml as a simple serialization format in a Python script. My Yaml serialized objects represent a sort of "document", so for maximum user-friendliness, I'd like my object's "name" field to appear first in the file. Of course, since the value returned by my object's __getstate__ is a dictionary, and Python dictionaries are unordered, the "name" field will be serialized to a random location in the output.

e.g.

>>> import yaml
>>> class Document(object):
...     def __init__(self, name):
...         self.name = name
...         self.otherstuff = 'blah'
...     def __getstate__(self):
...         return self.__dict__.copy()
... 
>>> doc = Document('obj-20111227')
>>> print yaml.dump(doc, indent=4)
!!python/object:__main__.Document
otherstuff: blah
name: obj-20111227
Cerin
  • For the record, there is a similar question (asked a couple of years after this one) here: http://stackoverflow.com/q/16782112/877069 – Nick Chammas May 22 '15 at 02:41
  • Since this question was asked, Python's `dict` has come to retain the order in which keys are added. – NeilG Aug 04 '23 at 07:43

4 Answers

20

Took me a few hours of digging through PyYAML docs and tickets, but I eventually discovered this comment that lays out some proof-of-concept code for serializing an OrderedDict as a normal YAML map (but maintaining the order).

e.g. applied to my original code, the solution looks something like:

>>> import yaml
>>> from collections import OrderedDict
>>> def dump_anydict_as_map(anydict):
...     yaml.add_representer(anydict, _represent_dictorder)
... 
>>> def _represent_dictorder(self, data):
...     if isinstance(data, Document):
...         return self.represent_mapping('tag:yaml.org,2002:map', data.__getstate__().items())
...     else:
...         return self.represent_mapping('tag:yaml.org,2002:map', data.items())
... 
>>> class Document(object):
...     def __init__(self, name):
...         self.name = name
...         self.otherstuff = 'blah'
...     def __getstate__(self):
...         d = OrderedDict()
...         d['name'] = self.name
...         d['otherstuff'] = self.otherstuff
...         return d
... 
>>> dump_anydict_as_map(Document)
>>> doc = Document('obj-20111227')
>>> print yaml.dump(doc, indent=4)
!!python/object:__main__.Document
name: obj-20111227
otherstuff: blah
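
On Python 3, the same idea can be reduced to a representer registered for OrderedDict itself. The following is only a minimal sketch, not the code from the linked comment: the function name `represent_ordereddict` and the extra `author` key are illustrative assumptions.

```python
from collections import OrderedDict

import yaml


def represent_ordereddict(dumper, data):
    # Passing data.items() rather than the mapping itself hands PyYAML a
    # plain sequence of pairs, which it emits in iteration order.
    return dumper.represent_mapping('tag:yaml.org,2002:map', data.items())


# Register the representer on PyYAML's default Dumper.
yaml.add_representer(OrderedDict, represent_ordereddict)

state = OrderedDict()
state['name'] = 'obj-20111227'
state['otherstuff'] = 'blah'
state['author'] = 'Cerin'  # hypothetical field; would sort first alphabetically

text = yaml.dump(state, default_flow_style=False)
print(text)
```

The keys should come out in insertion order (name, otherstuff, author) and without an OrderedDict-specific tag, because the representer emits an ordinary YAML map.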
Ethan Z
Cerin
15

New Solution (as of 2020 and PyYAML 5.1)

You can dump a dictionary in its current order by simply using

yaml.dump(data, default_flow_style=False, sort_keys=False)
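
To see the effect of the flag, here is a small comparison. It is a sketch that assumes PyYAML 5.1 or later and Python 3.7 or later (where plain dicts keep insertion order); the sample keys are made up for illustration.

```python
import yaml

# Hypothetical sample data; 'author' is inserted last but sorts first.
data = {'name': 'obj-20111227', 'otherstuff': 'blah', 'author': 'Cerin'}

# Default behaviour: keys are sorted alphabetically.
sorted_text = yaml.dump(data, default_flow_style=False)

# With sort_keys=False the dict's own insertion order is kept.
ordered_text = yaml.dump(data, default_flow_style=False, sort_keys=False)

print(sorted_text)
print(ordered_text)
```

The first dump should start with `author`, the second with `name`.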
Phil B
  • 1
    Thank you so much, it's so great to know that such a simple option exists in the latest version. Just made my day! – Ainz Titor Dec 13 '20 at 19:48
  • Please upvote this to the top solution to help others avoid wasting time on custom solutions now that it's built in to the library. – Phil B Mar 08 '23 at 04:09
  • I couldn't find this key word argument documented in https://pyyaml.org/wiki/PyYAMLDocumentation – NeilG Aug 04 '23 at 07:44
6

I think the problem occurs when you dump the data. I looked into the PyYAML source code and found an optional argument called sort_keys; setting it to False seems to do the trick.
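
Applied to the Document class from the question, this might look like the sketch below. It assumes PyYAML 5.1 or later and Python 3.7 or later; the `author` attribute is a hypothetical addition, included only because it would otherwise be sorted ahead of `name`.

```python
import yaml


class Document(object):
    def __init__(self, name):
        self.name = name
        self.author = 'Cerin'  # hypothetical attribute, not in the question
        self.otherstuff = 'blah'

    def __getstate__(self):
        # A copy of __dict__ keeps the order in which attributes were set.
        return self.__dict__.copy()


doc = Document('obj-20111227')

# sort_keys=False keeps the attribute order instead of sorting the keys.
text = yaml.dump(doc, indent=4, sort_keys=False)
print(text)
```

The first line of the output is the `!!python/object:` tag for the class, followed by the fields in the order they were assigned in `__init__`.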

Sutsuj
  • 2
    This answer is what I was looking for. If you set `sort_keys` to `False`, PyYaml will respect your dictionary ordering: `yaml.dump(data, file, sort_keys=False)` – Voxel Minds Jun 02 '20 at 10:39
-10

The last time I checked, Python's dictionaries weren't ordered. If you really want them to be, I strongly recommend using a list of key/value pairs.

[
    ('key', 'value'),
    ('key2', 'value2')
]

Alternatively, define a list with the keys and put them in the right order.

keys = ['key1', 'name', 'price', 'key2']
for key in keys:
    print obj[key]
Tom van der Woerdt
  • 4
    Like my post says, I know Python dictionaries are unordered. Unfortunately, there's a big difference in Yaml readability between a dictionary and a list of tuples, so this won't work in my case. – Cerin Dec 28 '11 at 02:28
  • 1
    Python dictionaries are ordered as of 3.6 – Mattwmaster58 Feb 22 '20 at 21:28