According to (for instance) this discussion, newlines in strings in XML files are allowed. I have such a file and I need to manipulate it. To be specific, From certain subelements with an existing id, I need to create duplicates with a new id. I have a test python script to do this:
import xml.etree.ElementTree as ET
from xml.dom import minidom
import copy
def double_override_entries(input_file, search_id, replace_id, output_file):
tree = ET.parse(input_file)
root = tree.getroot()
for override_elem in root.findall(".//override"):
for language_elem in override_elem.findall(".//language[@id='{}']".format(search_id)):
language_elem_copy = copy.deepcopy(language_elem)
language_elem_copy.set('id', replace_id)
override_elem.append( language_elem_copy)
xml_str = ET.tostring(root, encoding='utf-8').decode('utf-8')
with open(output_file, 'w', encoding='utf-8') as file:
file.write(xml_str)
input_file = "input.xml"
search_id = "yy"
replace_id = "zz"
output_file = "output.xml"
double_override_entries(input_file, search_id, replace_id, output_file)
This works, except for one thing: the newlines inside strings in the original XML have been removed. I'm looking for a way to do this same manipulation, but preserving these newlines inside property value strings.
Input:
<?xml version="1.0" encoding="UTF-8"?>
<model>
<overrides>
<override identifier="id-12391" labelPlacement="manual" labelOffset="(35,20)">
<language id="en" labelText="Internal
Exploitation
Dependencies"/>
<language id="yy" labelText="Internal
Exploitation
Dependencies"/>
</override>
</overrides>
</model>
Output:
<model>
<overrides>
<override identifier="id-12391" labelPlacement="manual" labelOffset="(35,20)">
<language id="en" labelText="Internal Exploitation Dependencies" />
<language id="yy" labelText="Internal Exploitation Dependencies" />
<language id="zz" labelText="Internal Exploitation Dependencies" />
</override>
</overrides>
</model>
[Update] ElementTree is compliant (see answer below), so removed that statement from the question. I still would like a python workaround for this non-compliant use, though. [/Update]
Any trick to do this would be fine, though I need to run it on macOS + MacPorts (currently using python 3.11)