Use Python with lxml to parse an XML document and write elements into a text file

Question

I want to parse an XML file with the Python code below; an extract of the XML file follows the code. I need to extract everything that comes after "inv: name =" in the body attribute, which in this case is "'datasource roof height' and (value = 1000 or value = 2000 or value = 3000 or value = 4000 or value = 5000 or value = 6000)". Any ideas?

My Python code (so far):

from lxml import etree
doc = etree.parse("data.xml")
for con in doc.xpath("//specification"):
    for cons in con.xpath("./@body"):
        with open("output.txt", "w") as cons_out:
            cons_out.write(cons)
        cons_out.close()

Part of the XML file:

<ownedRule xmi:type="uml:Constraint" xmi:id="EAID_OR000004_EE68_4efa_8E1B_8DDFA8F95FB8" name="datasource roof height">
    <constrainedElement xmi:idref="EAID_94F3B0A6_EE68_4efa_8E1B_8DDFA8F95FB8"/>
    <specification xmi:type="uml:OpaqueExpression" xmi:id="EAID_COE000004_EE68_4efa_8E1B_8DDFA8F95FB8" body="inv: name = 'datasource roof height'  and (value = 1000 or value = 2000 or value = 3000 or value = 4000 or value = 5000 or value = 6000)"/>
</ownedRule>

score 0 · Accepted Answer · answered Oct 25 '15 at 05:18

An XML parser understands elements and attributes. It does not interpret the text inside them, so the value of an attribute is just an opaque string as far as the parser is concerned.

To solve your problem, you need to split the string retrieved from the body attribute yourself. I am assuming that the body attribute of every element has the same format, i.e. "inv: name = some content".

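from lxml import etree

doc = etree.parse("data.xml")
# Open the output file once; reopening it with "w" for every match would overwrite earlier results.
with open("output.txt", "w") as cons_out:
    for con in doc.xpath("//specification"):
        for cons in con.xpath("./@body"):
            # Keep only the text after the "inv: name =" prefix.
            content = cons.split("inv: name =", 1)[1]
            cons_out.write(content.strip() + "\n")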
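
One caveat with indexing the result of split: if a body value does not contain the prefix, taking element [1] raises IndexError. A minimal sketch of a more tolerant variant, using str.partition, might look like the following. The helper name text_after_prefix is hypothetical and not part of lxml, and the sample string is shortened from the one in the question.

```python
def text_after_prefix(body, prefix="inv: name ="):
    """Return the part of body that follows prefix, or None if prefix is absent."""
    # str.partition never raises: when the prefix is missing, the separator
    # slot comes back as an empty string instead of the lookup failing.
    _, found, rest = body.partition(prefix)
    return rest.strip() if found else None


# Shortened sample value, assumed to follow the format shown in the question.
body = "inv: name = 'datasource roof height'  and (value = 1000 or value = 2000)"
print(text_after_prefix(body))
print(text_after_prefix("no prefix here"))
```

Returning None for a missing prefix lets the caller decide whether to skip the element or report it, rather than stopping the whole run.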

Another question is how I can write the parsed content line by line to a text document like this: "new line" `inv: name = 'datasource roof height' and (value = 1000 or value = 2000 or value = 3000)` "new line" `inv: name = 'datasource ground height' and (value = 1100 or value = 2100 or value = 3100 )` — F.D.S, Oct 25 '15 at 20:24
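
One possible way to get one body value per line, as asked in the comment above, is to open the output file once and write a newline after each value. The following is only a sketch under stated assumptions: the sample document, the xmi namespace URI, the temporary output path and the function name write_bodies are all made up for illustration, and the fallback import is there only so the example also runs where lxml is not installed.

```python
import os
import tempfile

try:
    from lxml import etree
except ImportError:  # fallback so the sketch still runs without lxml
    import xml.etree.ElementTree as etree

# Assumed sample: the xmi namespace URI is a guess, since the extract in the
# question does not show where the prefix is declared.
SAMPLE = """<root xmlns:xmi="http://schema.omg.org/spec/XMI/2.1">
  <ownedRule name="datasource roof height">
    <specification xmi:type="uml:OpaqueExpression" body="inv: name = 'datasource roof height' and (value = 1000 or value = 2000 or value = 3000)"/>
  </ownedRule>
  <ownedRule name="datasource ground height">
    <specification xmi:type="uml:OpaqueExpression" body="inv: name = 'datasource ground height' and (value = 1100 or value = 2100 or value = 3100 )"/>
  </ownedRule>
</root>"""


def write_bodies(root, out_path):
    """Write every specification body attribute to out_path, one per line."""
    count = 0
    # Open the file once, outside the loop; reopening it with "w" for each
    # element would truncate it and keep only the last value.
    with open(out_path, "w", encoding="utf-8") as out:
        for spec in root.iter("specification"):
            body = spec.get("body")
            if body is not None:
                out.write(body.strip() + "\n")
                count += 1
    return count


root = etree.fromstring(SAMPLE)
out_path = os.path.join(tempfile.mkdtemp(), "output.txt")
print(write_bodies(root, out_path), "lines written to", out_path)
```

For a real file, etree.parse("data.xml").getroot() would take the place of etree.fromstring(SAMPLE), and out_path would be the desired output file.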
