I have an issue with the solution from "Merging xml-files".

My problem is that I also have to take the element values into account, and duplicates have to be removed.

In this example, the entries for FileB.txt are duplicated.

FileA.xml

<update>
  <Files>
    <CopyFile overwrite="FALSE">
      <SrcFile>\FolderA\FileA.txt</SrcFile>
      <DestFile>\FolderB\FileA.txt</DestFile>
    </CopyFile>
    <CopyFile overwrite="FALSE">
      <SrcFile>\FolderA\FileB.txt</SrcFile>
      <DestFile>\FolderB\FileB.txt</DestFile>
    </CopyFile>
  </Files>
</update>

FileB.xml

<update>
  <Files>
    <CopyFile overwrite="FALSE">
      <SrcFile>\FolderA\FileB.txt</SrcFile>
      <DestFile>\FolderB\FileB.txt</DestFile>
    </CopyFile>
    <CopyFile overwrite="FALSE">
      <SrcFile>\FolderA\FileC.txt</SrcFile>
      <DestFile>\FolderB\FileC.txt</DestFile>
    </CopyFile>
  </Files>
</update>

Expected Result.xml

<update>
  <Files>
    <CopyFile overwrite="FALSE">
      <SrcFile>\FolderA\FileA.txt</SrcFile>
      <DestFile>\FolderB\FileA.txt</DestFile>
    </CopyFile>
    <CopyFile overwrite="FALSE">
      <SrcFile>\FolderA\FileB.txt</SrcFile>
      <DestFile>\FolderB\FileB.txt</DestFile>
    </CopyFile>
    <CopyFile overwrite="FALSE">
      <SrcFile>\FolderA\FileC.txt</SrcFile>
      <DestFile>\FolderB\FileC.txt</DestFile>
    </CopyFile>
  </Files>
</update>

If I change the mapping to

    mapping = {(el.tag, hashabledict(el.attrib), el.text): el for el in one}

the parent CopyFile element is missing for the new entry: its SrcFile and DestFile are appended to the previous CopyFile.

My result is:

<update>
  <Files>
    <CopyFile overwrite="FALSE">
      <SrcFile>\FolderA\FileA.txt</SrcFile>
      <DestFile>\FolderB\FileA.txt</DestFile>
    </CopyFile>
    <CopyFile overwrite="FALSE">
      <SrcFile>\FolderA\FileB.txt</SrcFile>
      <DestFile>\FolderB\FileB.txt</DestFile>
      <SrcFile>\FolderA\FileC.txt</SrcFile>
      <DestFile>\FolderB\FileC.txt</DestFile>
    </CopyFile>
  </Files>
</update>

Any ideas?

mswyss

1 Answer

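The code below keeps every CopyFile from the first document and adds only those from the second whose SrcFile is not already present: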

import xml.etree.ElementTree as ET

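# Raw strings keep the backslashes in the Windows paths literal
xml1 = r'''<update>
  <Files>
    <CopyFile overwrite="FALSE">
      <SrcFile>\FolderA\FileA.txt</SrcFile>
      <DestFile>\FolderB\FileA.txt</DestFile>
    </CopyFile>
    <CopyFile overwrite="FALSE">
      <SrcFile>\FolderA\FileB.txt</SrcFile>
      <DestFile>\FolderB\FileB.txt</DestFile>
    </CopyFile>
  </Files>
</update>'''
# The second document repeats the FileB.txt entry, as in the question
xml2 = r'''<update>
  <Files>
    <CopyFile overwrite="FALSE">
      <SrcFile>\FolderA\FileB.txt</SrcFile>
      <DestFile>\FolderB\FileB.txt</DestFile>
    </CopyFile>
    <CopyFile overwrite="FALSE">
      <SrcFile>\FolderA\FileC.txt</SrcFile>
      <DestFile>\FolderB\FileC.txt</DestFile>
    </CopyFile>
  </Files>
</update>'''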


root1 = ET.fromstring(xml1)
root2 = ET.fromstring(xml2)
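# Keep every CopyFile from the first document
copy_files = root1.findall('.//CopyFile')
# Remember which source paths are already present
src_files = {e.find('./SrcFile').text for e in copy_files}
# Add only the CopyFile entries from the second document that are new
copy_files.extend(e for e in root2.findall('.//CopyFile') if e.find('./SrcFile').text not in src_files)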

merged_root = ET.Element('update')
# The container tag is 'Files' (capital F), matching the input documents
files = ET.SubElement(merged_root, 'Files')
files.extend(copy_files)

ET.dump(merged_root)

Output:

<update><Files><CopyFile overwrite="FALSE">
      <SrcFile>\FolderA\FileA.txt</SrcFile>
      <DestFile>\FolderB\FileA.txt</DestFile>
    </CopyFile>
    <CopyFile overwrite="FALSE">
      <SrcFile>\FolderA\FileB.txt</SrcFile>
      <DestFile>\FolderB\FileB.txt</DestFile>
    </CopyFile>
  <CopyFile overwrite="FALSE">
      <SrcFile>\FolderA\FileC.txt</SrcFile>
      <DestFile>\FolderB\FileC.txt</DestFile>
    </CopyFile>
  </Files></update>
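
Note that the code above treats two entries as duplicates when their SrcFile matches, regardless of DestFile or the overwrite attribute. If all values should be considered, as the question asks, one possible variation is to build the comparison key from the attributes and every child value. This is only a sketch: the helper names entry_key and merge_copy_files are made up for illustration, and it assumes every CopyFile sits somewhere below the root and has only simple text children.

```python
import xml.etree.ElementTree as ET


def entry_key(copy_file):
    """Build a hashable key from the attributes and child values of a CopyFile."""
    attribs = tuple(sorted(copy_file.attrib.items()))
    # Strip surrounding whitespace so indentation differences do not matter
    children = tuple((child.tag, (child.text or '').strip()) for child in copy_file)
    return attribs, children


def merge_copy_files(*roots):
    """Return a new <update> tree that contains each distinct CopyFile once."""
    merged_root = ET.Element('update')
    files = ET.SubElement(merged_root, 'Files')
    seen = set()
    for root in roots:
        for copy_file in root.findall('.//CopyFile'):
            key = entry_key(copy_file)
            if key not in seen:
                # First time this exact entry appears: keep it
                seen.add(key)
                files.append(copy_file)
    return merged_root
```

With the files from the question, the roots could be obtained with ET.parse('FileA.xml').getroot() and ET.parse('FileB.xml').getroot(), and the result written with ET.ElementTree(merged_root).write('Result.xml'). On Python 3.9 or later, calling ET.indent(merged_root) before writing should tidy up the uneven indentation seen in the output above.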
balderman