
I wrote some Python (3.7) code that dumps JSON to YAML and inserts the result into another YAML document, like this:

sample_code = '''
A :
  - DATA_A

B :
  C :
    {}
'''

import yaml

json_code = { 'json' : { 'D': 'VALUE_D', 'E': 'VALUE_E' } }
result = sample_code.format( yaml.dump(json_code) )

After writing the formatted text to a file, I got:

A :
  - DATA_A
B :
  C :
    json :
  D : VALUE_D
  E : VALUE_E

I want to get a result like this:

A :
  - DATA_A
B :
  C :
    json :
      D : VALUE_D
      E : VALUE_E

For now I used a crude workaround:

json_code_dumped = yaml.dump(json_code).replace("  ", "          ")
sample_code.format( json_code_dumped )

This works for now, but is there a smarter way to do this?

Thank you.

Rivian

1 Answer


The correct way would be to have the whole YAML data loaded as in-memory data and to insert the JSON data there. This guarantees dumping will produce valid YAML.

Now there are different ways to approach this. Here's the simplest one:

import yaml

sample_code = '''
A :
  - DATA_A

B :
  C : {}
'''

data = yaml.safe_load(sample_code)
data["B"]["C"]["json"] = { "D": "VALUE_D", "E": "VALUE_E" }
print(yaml.dump(data))

The {} is now interpreted as an empty mapping, since it is loaded as YAML. This is why you can put a "json" key in it. The output looks like this:

A:
- DATA_A
B:
  C:
    json:
      D: VALUE_D
      E: VALUE_E

The formatting changes because YAML does not preserve formatting (for details see this question).
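To illustrate the loss of formatting, here is a small sketch. The comment and the spacing in `text` are made up for the demonstration. The parsed data survives a load/dump round trip, but the comment and the space before the colon do not:

```python
import yaml

# Hypothetical input with a comment and a space before the colon.
text = '''
A :   # a comment
  - DATA_A
'''

data = yaml.safe_load(text)
dumped = yaml.dump(data)

# The data itself is unchanged by the round trip ...
assert yaml.safe_load(dumped) == data
# ... but the comment from the original text is gone.
assert '#' not in dumped
print(dumped)
```

If the original layout or comments matter, a plain load/dump with PyYAML is probably not enough on its own.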


A more sophisticated approach is shown in the answer to this question. It uses the low-level event API, which lets you specify the position to insert at as a list of strings and control the format of scalars in more detail. It is much more complicated, though, and requires you to supply the additional data as YAML events. If you have actual JSON input, you can load it as YAML events, since YAML is a superset of JSON.
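As a rough sketch of that last point only (it does not reproduce the linked answer, and `json_text` is a made-up example): PyYAML's `yaml.parse` turns a JSON string into a stream of events, and `yaml.emit` serializes events back to text.

```python
import json
import yaml

# Hypothetical JSON input.
json_text = '{"json": {"D": "VALUE_D", "E": "VALUE_E"}}'

# yaml.parse yields low-level events (stream, document, mapping, scalar).
events = list(yaml.parse(json_text))
for event in events:
    print(event)

# The same events can be emitted again as YAML text.
emitted = yaml.emit(events)

# Loading the JSON text as YAML gives the same data as the json module.
assert yaml.safe_load(json_text) == json.loads(json_text)
assert yaml.safe_load(emitted) == json.loads(json_text)
```

Inserting at a specific position would then mean splicing these events into the event stream of the target document, which is the complicated part.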


Here's a third approach which lets you specify the placeholder in the YAML file like you currently do:

from yaml.loader import SafeLoader
import yaml

class MyLoader(SafeLoader):
  def __init__(self, stream):
    SafeLoader.__init__(self, stream)
    self.replacements = []

def replace_placeholder(loader, node):
  index = int(loader.construct_scalar(node))
  return loader.replacements[index]

MyLoader.add_constructor('!placeholder', replace_placeholder)


sample_code = '''
A :
  - DATA_A

B :
  C : !placeholder 0
'''

loader = MyLoader(sample_code)
loader.replacements.append(
  { 'json' : { 'D': 'VALUE_D', 'E': 'VALUE_E' }})
print(yaml.dump(loader.get_single_data()))

Much like with format, you have a placeholder in the YAML file which is identified by the local tag !placeholder. I give it a number to show that you could have multiple placeholders. You put the data you want to replace the placeholder(s) with into the loader's replacements and then during loading, our custom constructor will replace the placeholders with the actual data. The result is the same as with the first approach.
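A minimal sketch of the multiple-placeholder case follows. The names `PlaceholderLoader` and `load_with_replacements`, and the second placeholder in the template, are assumptions added for illustration; the mechanism is the same custom constructor as above.

```python
import yaml
from yaml.loader import SafeLoader

class PlaceholderLoader(SafeLoader):
    """SafeLoader that carries a list of replacement values."""
    def __init__(self, stream, replacements):
        super().__init__(stream)
        self.replacements = replacements

def replace_placeholder(loader, node):
    # The scalar after the tag is the index into loader.replacements.
    return loader.replacements[int(loader.construct_scalar(node))]

PlaceholderLoader.add_constructor('!placeholder', replace_placeholder)

def load_with_replacements(text, replacements):
    # Hypothetical helper: load one document, filling in the placeholders.
    loader = PlaceholderLoader(text, replacements)
    try:
        return loader.get_single_data()
    finally:
        loader.dispose()

template = '''
A : !placeholder 1
B :
  C : !placeholder 0
'''

data = load_with_replacements(template, [
    {'json': {'D': 'VALUE_D', 'E': 'VALUE_E'}},  # index 0
    ['DATA_A'],                                  # index 1
])
print(yaml.dump(data))
```

Passing the replacements to the constructor keeps the loader and its data together, so nothing has to be appended after the loader is created.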

flyx
The third approach worked perfectly for me. The simplest approach failed with 'TypeError: list indices must be integers, not str'. I'm not sure why... Anyway, thank you very much for your nice answer. – Rivian May 18 '20 at 07:19