39

I've been trying to dump a dictionary to a YAML file. The problem is that the program that imports the YAML file needs the keywords in a specific order, and that order is not alphabetical.

import yaml
import os 

baseFile = 'myfile.dat'
lyml = [{'BaseFile': baseFile}]
lyml.append({'Environment':{'WaterDepth':0.,'WaveDirection':0.,'WaveGamma':0.,'WaveAlpha':0.}})

CaseName = 'OrderedDict.yml'
CaseDir = r'C:\Users\BTO\Documents\Projects\Mooring code testen'
CaseFile = os.path.join(CaseDir, CaseName)
with open(CaseFile, 'w') as f:
    yaml.dump(lyml, f, default_flow_style=False)

This produces a *.yml file which is formatted like this:

- BaseFile: myfile.dat
- Environment:
    WaterDepth: 0.0
    WaveAlpha: 0.0
    WaveDirection: 0.0
    WaveGamma: 0.0

But I want the order to be preserved:

- BaseFile: myfile.dat
- Environment:
    WaterDepth: 0.0
    WaveDirection: 0.0
    WaveGamma: 0.0
    WaveAlpha: 0.0

Is this possible?

dreftymac
Ben

5 Answers

65

yaml.dump has a sort_keys keyword argument (available since PyYAML 5.1) that is set to True by default. Set it to False to keep the keys in the order of the dict:

with open(CaseFile, 'w') as f:
    yaml.dump(lyml, f, default_flow_style=False, sort_keys=False)
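
To check the effect without touching the file system, here is a minimal sketch that dumps the question's data to a string. It assumes PyYAML 5.1 or later (older versions do not accept sort_keys) and Python 3.7+, where a plain dict keeps insertion order:

```python
import yaml

lyml = [
    {'BaseFile': 'myfile.dat'},
    {'Environment': {'WaterDepth': 0., 'WaveDirection': 0.,
                     'WaveGamma': 0., 'WaveAlpha': 0.}},
]

# Without a stream argument, yaml.dump returns the YAML document as a string.
text = yaml.dump(lyml, default_flow_style=False, sort_keys=False)
print(text)

# Collect the nested keys in the order they appear in the output.
nested_keys = [line.strip().split(':')[0] for line in text.splitlines()[2:]]
print(nested_keys)
```

The nested keys should come out in insertion order (WaterDepth, WaveDirection, WaveGamma, WaveAlpha) rather than alphabetically.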
mkrieger1
Eric
  • `TypeError: dump_all() got an unexpected keyword argument 'sort_keys'` – James May 13 '20 at 09:50
  • I can confirm using sort_keys=False solves the problem, perfectly. I am using PyYAML==6.0.1 in Python 3.10. You are a life saver mate! – ENDEESA Aug 27 '23 at 18:17
35

Use an OrderedDict instead of a dict, and run the setup code below once at the start. After that, yaml.dump should preserve the order. More details here and here.

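import yaml
from collections import OrderedDict

def setup_yaml():
    """ https://stackoverflow.com/a/8661021 """
    # Emit an OrderedDict as a plain YAML mapping, keeping its insertion order.
    represent_dict_order = lambda self, data: self.represent_mapping('tag:yaml.org,2002:map', data.items())
    yaml.add_representer(OrderedDict, represent_dict_order)

setup_yaml()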

Example: https://pastebin.com/raw.php?i=NpcT6Yc4

balki
  • Somehow I am not able to write in a code block. My point is that, using the above solution, the required structure cannot be obtained? As OrderedDict(['Environment',(WaterDepth, 0), (WaveDirection, 0 ,(WaveGamma, 0),(WaveAlpha,0)]) isn't correct. Any suggestions? – Ben Jul 24 '15 at 14:02
8

PyYAML supports representers, which serialize a class instance to a YAML node.

yaml.YAMLObject uses metaclass magic to register a constructor, which transforms a YAML node to a class instance, and a representer, which serializes a class instance to a YAML node.

Add the following lines above your code:

def represent_dictionary_order(self, dict_data):
    return self.represent_mapping('tag:yaml.org,2002:map', dict_data.items())

def setup_yaml():
    yaml.add_representer(OrderedDict, represent_dictionary_order)

setup_yaml()

Then you can use OrderedDict to preserve the order in yaml.dump():

import yaml
from collections import OrderedDict

def represent_dictionary_order(self, dict_data):
    return self.represent_mapping('tag:yaml.org,2002:map', dict_data.items())

def setup_yaml():
    yaml.add_representer(OrderedDict, represent_dictionary_order)

setup_yaml()    

dic = OrderedDict()

dic['a'] = 1
dic['b'] = 2
dic['c'] = 3

print(yaml.dump(dic))
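# PyYAML < 5.1 prints flow style: {a: 1, b: 2, c: 3}
# PyYAML >= 5.1 defaults to block style: one 'key: value' pair per line, in the same order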
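
Applied to the nested data from the question, the same idea could look like the sketch below. It assumes PyYAML's default Dumper (the one add_representer registers on when no Dumper is given); the variable name environment is only illustrative:

```python
import yaml
from collections import OrderedDict

def represent_dictionary_order(self, dict_data):
    # Passing items() instead of the mapping itself means PyYAML does not
    # sort the keys, so they are emitted in insertion order.
    return self.represent_mapping('tag:yaml.org,2002:map', dict_data.items())

yaml.add_representer(OrderedDict, represent_dictionary_order)

# Build the nested mapping from (key, value) pairs in the required order.
environment = OrderedDict([
    ('WaterDepth', 0.),
    ('WaveDirection', 0.),
    ('WaveGamma', 0.),
    ('WaveAlpha', 0.),
])
lyml = [{'BaseFile': 'myfile.dat'}, {'Environment': environment}]

text = yaml.dump(lyml, default_flow_style=False)
print(text)
```

Only the mapping whose order matters needs to be an OrderedDict; the single-key outer dicts can stay plain dicts.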
Akif
1

Your difficulties are a result of assumptions on multiple levels that are incorrect and, depending on your YAML parser, might not be transparently resolvable.

In Python's dict the keys are unordered (at least for Python < 3.6). Even though the keys have some order in the source file, that order is lost as soon as they are in the dict:

d = {'WaterDepth':0.,'WaveDirection':0.,'WaveGamma':0.,'WaveAlpha':0.}
for key in d:
    print(key)

gives (on Python < 3.6, where the order is arbitrary), for example:

WaterDepth
WaveGamma
WaveAlpha
WaveDirection

If you want your keys ordered you can use the collections.OrderedDict type (or my own ruamel.ordereddict type, which is written in C and more than an order of magnitude faster), and you have to add the keys in order, for example as a list of tuples:

from ruamel.ordereddict import ordereddict
# from collections import OrderedDict as ordereddict  # < this will work as well
d = ordereddict([('WaterDepth', 0.), ('WaveDirection', 0.), ('WaveGamma', 0.), ('WaveAlpha', 0.)])
for key in d:
    print(key)

which will print the keys in the order they were specified in the source.
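
A standard-library-only sketch of the same point, using collections.OrderedDict. The comparison with a plain dict at the end assumes Python 3.7+, where insertion order is part of the language:

```python
from collections import OrderedDict

pairs = [('WaterDepth', 0.), ('WaveDirection', 0.),
         ('WaveGamma', 0.), ('WaveAlpha', 0.)]

# An OrderedDict remembers the order in which keys were inserted.
ordered = OrderedDict(pairs)
keys = list(ordered)
print(keys)

# Assumption: Python 3.7+, where a plain dict keeps insertion order as well.
plain = dict(pairs)
print(list(plain) == keys)
```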

The second problem is that even if a Python dict has a key ordering that happens to be what you want, the YAML specification explicitly says that mappings are unordered, and that is how e.g. PyYAML implements the dumping of a Python dict to a YAML mapping (and the other way around). Also, if you dump an ordereddict or OrderedDict you normally don't get the plain YAML mapping that you indicate you want, but some tagged YAML entry.

Losing the order is often undesirable: in your case because your reader assumes some order, in my case because inconsistent key ordering after insertion/deletion made it difficult to compare versions. I therefore implemented round-trip consistency in ruamel.yaml, so you can do:

import sys
import ruamel.yaml as yaml

yaml_str = """\
- BaseFile: myfile.dat
- Environment:
    WaterDepth: 0.0
    WaveDirection: 0.0
    WaveGamma: 0.0
    WaveAlpha: 0.0
"""

data = yaml.load(yaml_str, Loader=yaml.RoundTripLoader)
print(data)
yaml.dump(data, sys.stdout, Dumper=yaml.RoundTripDumper)

which gives you exactly your output result. data works as a dict (and so does data['Environment']), but underneath they are smarter constructs that preserve order, comments, YAML anchor names, etc. You can of course change these (adding/deleting key-value pairs), which is easy, but you can also build these from scratch:

import sys
import ruamel.yaml as yaml
from ruamel.yaml.comments import CommentedMap

baseFile = 'myfile.dat'
lyml = [{'BaseFile': baseFile}]
lyml.append({'Environment': CommentedMap([('WaterDepth', 0.), ('WaveDirection', 0.), ('WaveGamma', 0.), ('WaveAlpha', 0.)])})
# Dump the structure built above (lyml), not the data loaded in the previous example.
yaml.dump(lyml, sys.stdout, Dumper=yaml.RoundTripDumper)

This again prints the contents with the keys in the order you want them. I find the latter less readable than starting from a YAML string, but it does construct the lyml data structure somewhat faster.

Anthon
1

oyaml is a Python library which preserves dict ordering when dumping. It is especially helpful in more complex cases where the dictionary is nested and may contain lists.

Once installed:

import oyaml as yaml

with open(CaseFile, 'w') as f:
    f.write(yaml.dump(lyml))