
Say the input is a.xml:

<?xml version="1.0" encoding="utf-8"?>
<permissions>
  <privapp-permissions package="a">
        <permission name="x"/>
        <permission name="y"/>
  </privapp-permissions>
</permissions>

And b.xml:

<?xml version="1.0" encoding="utf-8"?>
<permissions>
  <privapp-permissions package="a">
        <permission name="x"/>
        <permission name="z"/>
  </privapp-permissions>
</permissions>

Expected output c.xml:

<?xml version="1.0" encoding="utf-8"?>
<permissions>
  <privapp-permissions package="a">
        <permission name="x"/>
        <permission name="y"/>
        <permission name="z"/>
  </privapp-permissions>
</permissions>

I wrote the code below, but it just appends the two documents instead of merging them:

import xml.etree.ElementTree as ET

fname1 = "a.xml"
fname2 = "b.xml"
fname3 = "c.xml"

tree1 = ET.parse(fname1)
root1 = tree1.getroot()

tree2 = ET.parse(fname2)
root2 = tree2.getroot()

merged_tree = ET.ElementTree(ET.Element('root'))

merged_tree.getroot().append(root1)
merged_tree.getroot().append(root2)

merged_tree.write(fname3, encoding='utf-8')
lucky1928

2 Answers


You can collect all permission elements under one parent and then use a list of names to drop the duplicates:

import xml.etree.ElementTree as ET


a= """<?xml version="1.0" encoding="utf-8"?>
<permissions>
  <privapp-permissions package="a">
        <permission name="x"/>
        <permission name="y"/>
  </privapp-permissions>
</permissions>"""

b= """<?xml version="1.0" encoding="utf-8"?>
<permissions>
  <privapp-permissions package="a">
        <permission name="x"/>
        <permission name="z"/>
  </privapp-permissions>
</permissions>"""

c= """<?xml version="1.0" encoding="utf-8"?>
<permissions>
  <privapp-permissions package="a" />
</permissions>"""

tree = ET.fromstring(a)
tree1 = ET.fromstring(b)
l1 = tree.findall('.//permission')
l2 = tree1.findall('.//permission')

tree2 = ET.fromstring(c)

new = tree2.find('privapp-permissions')
new.extend(l1)
new.extend(l2)

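# Drop duplicates, keeping the first <permission> seen for each name.
for parent in tree2.findall('.//privapp-permissions'):
    d = []  # names already kept under this parent
    # findall() returns a list snapshot, so removing children here is safe;
    # removing while iterating tree2.iter() can skip the following element.
    for elem in parent.findall('permission'):
        name = elem.get('name')
        if name in d:
            parent.remove(elem)
        else:
            d.append(name)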
                          
root = ET.ElementTree(tree2)
ET.indent(root, space="  ", level=0)
#root.write(file_name, xml_declaration = True, encoding="utf-8")
ET.dump(root)

Output:

<permissions>
  <privapp-permissions package="a">
    <permission name="x" />
    <permission name="y" />
    <permission name="z" />
  </privapp-permissions>
</permissions>
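
If the real files contain more than one privapp-permissions block, the same idea can be wrapped in a function that groups by the package attribute first and then by permission name. The following is only a minimal sketch under that assumption: merge_permissions is a hypothetical helper name, and it assumes the layout shown in the question (permissions > privapp-permissions > permission) with no other child elements to preserve.

```python
import xml.etree.ElementTree as ET


def merge_permissions(xml_a: str, xml_b: str) -> ET.Element:
    """Merge two <permissions> documents (hypothetical helper).

    Blocks are grouped by their package attribute, and inside each
    group only the first <permission> per name is kept.
    """
    merged = ET.Element("permissions")
    groups = {}  # package -> merged <privapp-permissions> element
    seen = {}    # package -> set of permission names already copied

    for source in (xml_a, xml_b):
        root = ET.fromstring(source)
        for block in root.findall("privapp-permissions"):
            package = block.get("package")
            if package not in groups:
                # First time this package appears: create its output element.
                groups[package] = ET.SubElement(
                    merged, "privapp-permissions", dict(block.attrib)
                )
                seen[package] = set()
            for perm in block.findall("permission"):
                name = perm.get("name")
                if name not in seen[package]:
                    seen[package].add(name)
                    ET.SubElement(groups[package], "permission", dict(perm.attrib))
    return merged


doc_a = (
    '<permissions><privapp-permissions package="a">'
    '<permission name="x"/><permission name="y"/>'
    "</privapp-permissions></permissions>"
)
doc_b = (
    '<permissions><privapp-permissions package="a">'
    '<permission name="x"/><permission name="z"/>'
    "</privapp-permissions></permissions>"
)

result = merge_permissions(doc_a, doc_b)
print([p.get("name") for p in result.iter("permission")])  # ['x', 'y', 'z']
```

Building a fresh tree avoids removing elements from a tree while walking it, and the per-package sets keep a name that appears under two different packages from being treated as a duplicate.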
Hermann12

It is not clear whether you want to implement a generic merge (and, if so, what the criteria would be) or whether you only want to merge this particular data. In the latter case, it seems you want to group the privapp-permissions elements by @package and, inside each group, the inner permission elements by @name. With Python 3 you can use SaxonC HE (package saxonche) to run XSLT 3 as follows:

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="3.0"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="#all">
  
  <xsl:template match="permissions">
    <xsl:copy>
      <xsl:for-each-group select="privapp-permissions, $doc2/permissions/privapp-permissions" group-by="@package">
        <xsl:copy>
          <xsl:copy-of select="@*"/>
          <xsl:for-each-group select="current-group()/*" group-by="@name">
            <xsl:copy-of select="."/>
          </xsl:for-each-group>
        </xsl:copy>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

  <xsl:output method="xml" indent="yes"/>
  
  <xsl:param name="doc2" as="document-node()" select="doc('b.xml')"/>

</xsl:stylesheet>

Python code:

from saxonche import PySaxonProcessor

with PySaxonProcessor(license=False) as saxon:
    xslt30_processor = saxon.new_xslt30_processor()
    xslt30_processor.transform_to_file(source_file='a.xml', stylesheet_file='merge-with-grouping.xsl', output_file='merged-result.xml')

https://gist.github.com/martin-honnen/93d9290969b70646e68a5d2097ab3cf5

Martin Honnen