Assume we have the following XML files:

file1.xml:

<main>
<tag1 name="name1">
    <t1>text1</t1>
    <t2></t2>
    <t3></t3>
    <t4></t4>
    <t5>text5</t5>
    <t6>text6</t6>
</tag1>
</main>

file2.xml:

<main>
<tag1 name="name1">
    <t1>text1</t1>
    <t2></t2>
    <t3></t3>
    <t4>text4</t4>
    <t5>text5</t5>
    <t6>text6</t6>
    <t7>text7</t7>
    <t8></t8>
</tag1>
</main>

For each tag1 element whose name attribute matches in file1 and file2, I want to generate a third file that contains all tx tags from file1, plus any tx tags that exist in file2 but not in file1. In addition, when a tx tag exists in both files but has text only in file2, the output should take that text from file2. I want to do this using Python. See file_out.xml below for the expected result.

file_out.xml:

<main>
<tag1 name="name1">
    <t1>text1</t1>
    <t2></t2>
    <t3></t3>
    <t4>text4</t4>
    <t5>text5</t5>
    <t6>text6</t6>
    <t7>text7</t7>
    <t8></t8>
</tag1>
</main>
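
The rule described above can be stated compactly for a single pair of matching tag1 elements. This is only a minimal sketch, assuming Python's standard xml.etree.ElementTree; the helper name merge_children and the inline FILE1/FILE2 strings are hypothetical stand-ins for the two files, and a missing or whitespace-only text is treated as empty.

```python
import xml.etree.ElementTree as ET

# Inline copies of the two sample files (assumption: same content as above).
FILE1 = """<main><tag1 name="name1">
    <t1>text1</t1><t2></t2><t3></t3><t4></t4><t5>text5</t5><t6>text6</t6>
</tag1></main>"""

FILE2 = """<main><tag1 name="name1">
    <t1>text1</t1><t2></t2><t3></t3><t4>text4</t4><t5>text5</t5><t6>text6</t6>
    <t7>text7</t7><t8></t8>
</tag1></main>"""


def merge_children(elem1, elem2):
    """Hypothetical helper: copy into elem1 whatever elem2 has and elem1 lacks."""
    for child2 in elem2:
        child1 = elem1.find(child2.tag)
        if child1 is None:
            # Tag exists only in file2: add it to file1's element.
            elem1.append(child2)
        elif not (child1.text or "").strip():
            # Tag exists in both, but file1's copy has no text: take file2's.
            child1.text = child2.text


root1 = ET.fromstring(FILE1)
root2 = ET.fromstring(FILE2)
merge_children(root1.find("tag1"), root2.find("tag1"))

# Map each child tag to its text to inspect the merged result.
merged = {child.tag: child.text for child in root1.find("tag1")}
print(merged)
```

After the call, t4 carries the text from file2 and t7 and t8 are appended after t6, which matches file_out.xml.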
  • [This](http://stackoverflow.com/questions/14878706/merge-xml-files-with-nested-elements-without-external-libraries) and [this](http://stackoverflow.com/questions/15921642/merging-xml-files-using-pythons-elementtree) may help you. – Anzel Apr 11 '15 at 23:10
  • Hi Anzel, I already solved it and posted the answer. Thanks for commenting. – Luis Chaidez Apr 13 '15 at 00:54

1 Answer

I have solved the problem. I asked the question without a code snippet because I'm new to both XML parsing and Python. Note that in the XML examples above, tag1 is really game in my actual files, so this is how I solved it:

import xml.etree.ElementTree as ET
import sys, os

file1_path = sys.argv[1]
file2_path = sys.argv[2]
file_out_path = sys.argv[3]

if os.path.exists(file_out_path):
    os.remove(file_out_path)

tree1 = ET.parse(file1_path)
root1 = tree1.getroot()

tree2 = ET.parse(file2_path)
root2 = tree2.getroot()

# Pass 1: append every game that exists only in file2 to file1's root.
for game2 in root2.iter('game'):
    name2 = game2.get('name')
    found = False
    for game1 in root1.iter('game'):
        if game1.get('name') == name2:
            found = True
            break
    if not found:
        root1.append(game2)

#######################

# Pass 2: for games present in both files, copy over missing tags and missing text.
for game1 in root1.iter('game'):
    name1 = game1.get('name')
    for game2 in root2.iter('game'):
        name2 = game2.get('name')
        if name1 == name2:
            for tag2 in game2:
                tag1 = game1.find(tag2.tag)
                if tag1 is None:
                    game1.append(tag2)
                # Compare by value, not identity: `is " "` does not reliably match strings.
                elif tag1.text is None or tag1.text.strip() == "":
                    tag1.text = tag2.text

################################

# method='html' writes empty elements as <t2></t2>; the default XML method would write <t2 />.
tree1.write(file_out_path, method='html')
sys.exit(0)
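
The nested loops above compare every game in one file against every game in the other. As a sketch of an alternative, assuming name values are unique within each file, the games of file1 can be indexed by name once and looked up directly. The function name merge_games and the inline sample data are hypothetical. The serialization at the end uses short_empty_elements=False (available since Python 3.4), which keeps empty elements in the <t2></t2> form without switching to the HTML serializer.

```python
import xml.etree.ElementTree as ET


def merge_games(root1, root2):
    """Merge <game> elements of root2 into root1, matching on the name attribute."""
    # Assumption: names are unique within each file, so a dict lookup is safe.
    by_name = {game.get("name"): game for game in root1.iter("game")}
    for game2 in root2.iter("game"):
        game1 = by_name.get(game2.get("name"))
        if game1 is None:
            # Game exists only in the second file: copy it over whole.
            root1.append(game2)
            continue
        for tag2 in game2:
            tag1 = game1.find(tag2.tag)
            if tag1 is None:
                # Child tag missing from the first file: add it.
                game1.append(tag2)
            elif not (tag1.text or "").strip():
                # Child tag present but empty in the first file: take the text.
                tag1.text = tag2.text
    return root1


# Small hypothetical inputs standing in for the two files.
root1 = ET.fromstring('<main><game name="a"><t1>x</t1><t2></t2></game></main>')
root2 = ET.fromstring(
    '<main><game name="a"><t2>y</t2><t3></t3></game><game name="b"/></main>'
)
merged = merge_games(root1, root2)

# Keep empty elements as <tag></tag> instead of <tag />.
output = ET.tostring(merged, encoding="unicode", short_empty_elements=False)
print(output)
```

To write to a file instead, the same keyword can be passed to ElementTree.write, for example tree1.write(file_out_path, short_empty_elements=False).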