I am trying to update a plist as follows:

Match the string field "What change bugs are fixed in this submission?" and update the corresponding string field for <key>response</key>.

The issue is that right now the code updates the string field "What change bugs are fixed in this submission?" itself. How do I update the corresponding response string field instead? I have also added the expected plist output. Is there a simpler way to do this in Python? Where am I going wrong?

Code:

import re,os,fileinput
text1_to_search = re.compile(r'<string>What change bugs are fixed in this submission?.*</string>')
replacement1_text = """change://problem/219620> milestone: WCM-739#202 has failed to build in install: expected a type 
change://problem/215275> Fix logic for PSK-->Open update
change://problem/1265279> Hotspot keeps changing from the device I selected
"""

for line in fileinput.input(filename, inplace=True, backup='.bak'):
    print(text1_to_search.sub(replacement1_text, line)),

plist

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//company//DTD PLIST 1.0//EN" "http://www.company.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>28</key>
    <dict>
        <key>description</key>
        <string>Which update of macOS, Xcode, and the SDKs was this submission built on with 'abc buildit'?</string>
        <key>id</key>
        <string>28</string>
        <key>multiline</key>
        <string>0</string>
        <key>releases</key>
        <array>
            <string>milestone</string>
        </array>
        <key>response</key>
        <string></string>
    </dict>
    <key>7</key>
    <dict>
        <key>description</key>
        <string>What change bugs are fixed in this submission? (Please include the change number or URL followed by the title)</string>
        <key>id</key>
        <string>7</string>
        <key>multiline</key>
        <string>1</string>
        <key>releases</key>
        <array>
            <string>milestone</string>
        </array>
        <key>response</key>
        <string></string>
    </dict>
</dict>
</plist>

Expected output after update:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//company//DTD PLIST 1.0//EN" "http://www.company.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>28</key>
    <dict>
        <key>description</key>
        <string>Which update of macOS, Xcode, and the SDKs was this submission built on with 'abc buildit'?</string>
        <key>id</key>
        <string>28</string>
        <key>multiline</key>
        <string>0</string>
        <key>releases</key>
        <array>
            <string>milestone</string>
        </array>
        <key>response</key>
        <string></string>
    </dict>
    <key>7</key>
    <dict>
        <key>description</key>
        <string>What change bugs are fixed in this submission? (Please include the change number or URL followed by the title)</string>
        <key>id</key>
        <string>7</string>
        <key>multiline</key>
        <string>1</string>
        <key>releases</key>
        <array>
            <string>milestone</string>
        </array>
        <key>response</key>
        <string>change://problem/219620> milestone: WCM-739#202 has failed to build in install: expected a type 
change://problem/215275> Fix logic for PSK-->Open update
change://problem/1265279> Hotspot keeps changing from the device I selected</string>
    </dict>
</dict>
</plist>
carte blanche

1 Answer

Don't use regular expressions to parse XML; use an XML parser. In Python, lxml is recommended.

Using XPath, this is a possible implementation.

from lxml import etree as et

with open("input.xml") as raw:
    # Parse the XML input file into a tree.
    tree = et.parse(raw)

    # Find the <string> element to change: first locate the <string> whose
    # text identifies the entry, then its parent <dict>, then the <string>
    # that follows the <key>response</key> inside that <dict>. Since the
    # xpath method returns a list, we take the first element of each list.
    stringUsedAsKey = tree.xpath("/plist/dict/dict/string"
            + "[./text()=\"What change bugs are fixed in this submission? (Please include the change number or URL followed by the title)\"]")[0]
    interestingDict = stringUsedAsKey.getparent()
    stringToChange = interestingDict.xpath("key[text()=\"response\"]/following-sibling::string")[0]

    # Change the text of the <string>.
    stringToChange.text = "updated text"

    # Write the changed tree back to an XML file.
    tree.write("output.xml", pretty_print=True, xml_declaration=True, encoding="UTF-8")
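
Since the question also asks for a simpler way, here is a minimal sketch of an alternative that swaps XPath for Python's standard-library plistlib, which loads a plist into ordinary dicts. It is not the method from the answer above. The names `SAMPLE`, `QUESTION_PREFIX` and `set_response` are hypothetical, and `SAMPLE` is a trimmed copy of the plist from the question. One caveat, as far as I know: plistlib writes its own header (Apple's standard DOCTYPE) and sorts keys by default, so the output will not be byte-identical to the input.

```python
import plistlib

# Trimmed copy of the plist from the question (illustrative only).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//company//DTD PLIST 1.0//EN" "http://www.company.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>7</key>
    <dict>
        <key>description</key>
        <string>What change bugs are fixed in this submission? (Please include the change number or URL followed by the title)</string>
        <key>response</key>
        <string></string>
    </dict>
</dict>
</plist>
"""

# Hypothetical constant: the start of the description that identifies the entry.
QUESTION_PREFIX = "What change bugs are fixed in this submission?"


def set_response(plist_bytes, new_response):
    """Return plist XML bytes with the matching entry's response replaced."""
    data = plistlib.loads(plist_bytes)
    for entry in data.values():
        # Each top-level value is expected to be a dict describing one question.
        if isinstance(entry, dict) and entry.get("description", "").startswith(QUESTION_PREFIX):
            entry["response"] = new_response
    return plistlib.dumps(data)
```

For files on disk, `plistlib.load` and `plistlib.dump` take a file object opened in binary mode and work the same way.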
  • The above code changes `<string></string>` to `<string/>` everywhere in the XML. Why is that, and how do I fix it? – carte blanche Jul 16 '18 at 18:26
  • `<string></string>` and `<string/>` are identical; both are an empty element named `string`. Any sane XML parser must understand both. There is probably a way to force lxml not to shorten empty elements, though. –  Jul 16 '18 at 19:31
  • https://stackoverflow.com/questions/34111154/how-can-i-prevent-lxml-from-auto-closing-empty-elements-when-serializing-to-stri seems to help. I used `write_c14n`, but the XML declaration at the top is missing. Do you know how to get it back? – carte blanche Jul 16 '18 at 20:14
  • Use `xml_declaration=True` as in my answer. –  Jul 17 '18 at 06:58
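
The point raised in the comments, keeping empty elements in their long form while still getting an XML declaration, can be sketched without `write_c14n`. This assumes that lxml serializes an element whose text is an empty string (rather than `None`) as an open/close tag pair, which is the trick described in the linked question. The helper name `keep_empty_elements_open` is hypothetical.

```python
from lxml import etree as et


def keep_empty_elements_open(root):
    """Give childless, text-less elements an empty string as their text."""
    # iter("*") visits elements only, skipping comments and
    # processing instructions.
    for element in root.iter("*"):
        if len(element) == 0 and element.text is None:
            element.text = ""


# Small illustrative input; a real plist would be parsed with et.parse().
root = et.fromstring(b"<dict><key>response</key><string/></dict>")
keep_empty_elements_open(root)

# Regular serialization (not C14N), so the declaration can be requested.
serialized = et.tostring(root, xml_declaration=True, encoding="UTF-8")
```

With a tree obtained from `et.parse`, the same helper can be applied to `tree.getroot()` before calling `tree.write(..., xml_declaration=True, encoding="UTF-8")` as in the answer.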