
I want to load some YAML data in a Python script, but instead of a regular dict for mappings I would like to use my custom class (which keeps the insertion order and also merges keys/subdictionaries instead of overwriting them). I don't want to alter any other YAML types, only mappings. Browsing the net, the pyyaml documentation and SO, I don't see any clear and generic solution: in almost all cases undocumented features of pyyaml are used (for example in most of the solutions here). Initially I was thinking about inheriting from yaml.Loader and reimplementing construct_mapping(), but it seems that would also require using some pyyaml internals... But maybe it would "just work" when done like this:

import yaml

class MyLoader(yaml.Loader):  # my custom loader
    def construct_mapping(self, node, deep=False):
        # let the stock implementation build the mapping first
        mapping = super().construct_mapping(node, deep=deep)
        # do my own stuff with mapping, changing it to my own type
        return mapping

?
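One way to find out is to run the idea. The sketch below assumes `OrderedDict` as a stand-in for the custom class and uses a hypothetical loader name, `OverridingLoader`. Judging from the PyYAML source, the handler registered for plain mappings (`construct_yaml_map`) appears to copy whatever `construct_mapping()` returns into a fresh `dict`, so the override alone may not change the type of the loaded data. That is an implementation detail, so it is worth re-checking against the installed version.

```python
from collections import OrderedDict

import yaml


class OverridingLoader(yaml.Loader):
    """Hypothetical loader that only overrides construct_mapping()."""

    def construct_mapping(self, node, deep=False):
        mapping = super().construct_mapping(node, deep=deep)
        # OrderedDict is only a stand-in for the custom mapping class.
        return OrderedDict(mapping)


data = yaml.load("a: 1\nb: {c: 2}\n", Loader=OverridingLoader)

# The values survive, but the stock map handler copies them into a plain dict.
print(type(data).__name__)
print(data)
```

If the printed type is `dict`, the override ran but its return type was discarded, which suggests the type has to be changed where the mapping tag is handled rather than inside `construct_mapping()`.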

Maybe I should just add my custom constructor and register it for the !!map tag? Will it be matched that way, even though mappings typically have no explicit tag in a YAML file?
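A minimal sketch of the !!map idea follows. It assumes that untagged mappings resolve to the tag `tag:yaml.org,2002:map`, so a constructor registered for that tag on a loader subclass is used for them as well. `MergingDict`, `merge_item`, `MergingLoader` and `construct_merging_map` are hypothetical names, and the merge rule is only one possible reading of "merges keys/subdictionaries". The generator (`yield`) form is the two-step construction that PyYAML's own map constructor uses, and `flatten_mapping()` and `construct_pairs()` are PyYAML internals rather than documented API.

```python
from collections import OrderedDict

import yaml


class MergingDict(OrderedDict):
    """Hypothetical stand-in for the custom mapping class."""

    def merge_item(self, key, value):
        old = self.get(key)
        if isinstance(old, MergingDict) and isinstance(value, MergingDict):
            # Repeated key holding sub-mappings: merge instead of overwrite.
            for sub_key, sub_value in value.items():
                old.merge_item(sub_key, sub_value)
        else:
            self[key] = value


class MergingLoader(yaml.Loader):
    """Hypothetical subclass, so yaml.Loader itself stays untouched."""


def construct_merging_map(loader, node):
    data = MergingDict()
    yield data  # step one: hand out the (still empty) object
    # Expand "<<" merge keys, as PyYAML's own construct_mapping() does.
    loader.flatten_mapping(node)
    # construct_pairs() keeps repeated keys, so they can be merged here.
    # deep=True finishes child mappings first; the trade-off is that a
    # nested mapping referring to itself through an alias may be rejected.
    for key, value in loader.construct_pairs(node, deep=True):
        data.merge_item(key, value)


# Only the mapping tag is overridden; all other YAML types are unchanged.
MergingLoader.add_constructor("tag:yaml.org,2002:map", construct_merging_map)

text = """
a:
  x: 1
b: 2
a:
  y: 3
"""
data = yaml.load(text, Loader=MergingLoader)
print(type(data).__name__, dict(data))
```

Registering the constructor on a subclass rather than through the module-level `yaml.add_constructor()` keeps the change local to code that explicitly passes `Loader=MergingLoader`.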

Am I missing some obvious solution here, or is reimplementing construct_mapping() the easiest approach?

Freddie Chopin
  • AFAIK you cannot do this without features that are not officially documented, e.g. because otherwise you fail to construct your mapping in a two-step process. That is undocumented, but not doing so will certainly give you problems when using non-restricted YAML. Additionally: 1) if things are on SO, you should consider them documented. 2) PyYAML is stable (read: barely maintained), so these "undocumented features" are extremely unlikely to change. – Anthon Jan 09 '18 at 08:44
  • Possible duplicate of [In Python, how can you load YAML mappings as OrderedDicts?](https://stackoverflow.com/questions/5121931/in-python-how-can-you-load-yaml-mappings-as-ordereddicts) – Anthon Jan 09 '18 at 08:44
  • @Anthon - the problem with the solutions in the answer I linked is that I have absolutely no idea what is going on there or why particular functions are used (for example, why `construct_pairs()` instead of `construct_mapping()`, and what `flatten_mapping()` is actually doing) (; That's why I would prefer something more "straightforward", as there would be less guessing. – Freddie Chopin Jan 09 '18 at 08:53
  • What is straightforward and what is not is of course relative. This is undocumented in the official sources and I think it is not straightforward in your sense of that word. That means **you** cannot do it in the *straightforward* way that **you** want. Either study the PyYAML sources for a couple of days (as I did) to make doing this straightforward, or accept that there is a straightforward solution (that you, understandably, don't fully grasp). If those solutions don't work, then come back here with a concrete question (and your non-working source code). – Anthon Jan 09 '18 at 11:54
  • @Anthon - would this be easier to do in ruamel.yaml than in pyyaml? – Freddie Chopin Jan 09 '18 at 22:01
  • No, it would be harder (if you want to do it for the BaseLoader as well as for the RoundTripLoader). The type is hard-coded in the `construct_mapping` method in PyYAML and in the non-roundtrip loaders in ruamel.yaml (inherited from PyYAML), and in `construct_yaml_map` for the `RoundTripLoader`. – Anthon Jan 10 '18 at 14:22

0 Answers