1

I have an XML document of the following form:

<OUTPUT>
    <HEADER>

    </HEADER>
    <REGISTER>
        <RESULT>0</RESULT>
        <KEY1>CAR</KEY1>
        <KEY2>RED</KEY2>
        <KEY3>2013</KEY3>
        <ATTRIBUTE1>2000</ATTRIBUTE1>
        <ATTRIBUTE2>100000</ATTRIBUTE2>
    </REGISTER>
    <REGISTER>
        <RESULT>0</RESULT>
        <KEY1>TRUCK</KEY1>
        <KEY2>BLUE</KEY2>
        <KEY3>2014</KEY3>
        <ATTRIBUTE1>3000</ATTRIBUTE1>
        <ATTRIBUTE2>400000</ATTRIBUTE2>
    </REGISTER>
</OUTPUT>

How can I look up the value of ATTRIBUTE1 for the REGISTER whose KEY1, KEY2 and KEY3 have given values, without writing an explicit loop in Python (something like a lambda expression in C#)?

Thanks to @CommuSoft, I know that I can use XPath queries with libxml2. But when I try to install it with pip install libxml2-python, I get the error:

Could not find any downloads that satisfy the requirement libxml2-python

I also forgot to mention that I'm using Python 2.7 with Anaconda on Windows.

Willem Van Onsem
rlartiga

2 Answers

3

In general it is better to process XML with a library, and in this case specifically with an XPath query.

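import libxml2

# Parse the file and create an XPath evaluation context for it
doc = libxml2.parseFile("tst.xml")
ctxt = doc.xpathNewContext()
res = ctxt.xpathEval("//REGISTER/ATTRIBUTE1[../KEY1/text()='TRUCK' and ../KEY2/text()='BLUE' and ../KEY3/text()='2014']")

# Read whatever you need from res before this point: its nodes belong to doc
doc.freeDoc()
ctxt.xpathFreeContext()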

Here the query is:

//REGISTER/ATTRIBUTE1[../KEY1/text()='TRUCK' and ../KEY2/text()='BLUE' and ../KEY3/text()='2014']

The result is stored in res as a list of the matching ATTRIBUTE1 nodes.
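If installing libxml2 is the obstacle, the same lookup can be sketched with the standard library's xml.etree.ElementTree, which ships with Python and supports a limited XPath subset (including chained [tag='text'] predicates). This is a swapped-in alternative to libxml2, not the same API. The helper name attribute1_for is hypothetical, and the sketch assumes the key values contain no single quotes, since they are pasted into the path as-is.

```python
import xml.etree.ElementTree as ET

XML = """<OUTPUT>
    <HEADER>
    </HEADER>
    <REGISTER>
        <RESULT>0</RESULT>
        <KEY1>CAR</KEY1>
        <KEY2>RED</KEY2>
        <KEY3>2013</KEY3>
        <ATTRIBUTE1>2000</ATTRIBUTE1>
        <ATTRIBUTE2>100000</ATTRIBUTE2>
    </REGISTER>
    <REGISTER>
        <RESULT>0</RESULT>
        <KEY1>TRUCK</KEY1>
        <KEY2>BLUE</KEY2>
        <KEY3>2014</KEY3>
        <ATTRIBUTE1>3000</ATTRIBUTE1>
        <ATTRIBUTE2>400000</ATTRIBUTE2>
    </REGISTER>
</OUTPUT>"""


def attribute1_for(root, key1, key2, key3):
    """Return the text of every ATTRIBUTE1 whose sibling keys all match.

    Hypothetical helper; assumes the key values contain no single quotes.
    """
    # Each [TAG='text'] predicate filters REGISTER elements by a child's text;
    # chaining three predicates acts as a logical "and".
    path = ".//REGISTER[KEY1='{0}'][KEY2='{1}'][KEY3='{2}']/ATTRIBUTE1".format(
        key1, key2, key3)
    return [node.text for node in root.findall(path)]


root = ET.fromstring(XML)
print(attribute1_for(root, "TRUCK", "BLUE", "2014"))  # ['3000']
```

ElementTree does not understand the full XPath syntax (no "and", no "../" inside predicates), which is why the condition is expressed as three chained predicates on REGISTER instead.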

Willem Van Onsem
2

Use lxml with XPath.

import lxml.etree as etree

x = """<OUTPUT>
    <HEADER>

    </HEADER>
    <REGISTER>
        <RESULT>0</RESULT>
        <KEY1>CAR</KEY1>
        <KEY2>RED</KEY2>
        <KEY3>2013</KEY3>
        <ATTRIBUTE1>2000</ATTRIBUTE1>
        <ATTRIBUTE2>100000</ATTRIBUTE2>
    </REGISTER>
    <REGISTER>
        <RESULT>0</RESULT>
        <KEY1>TRUCK</KEY1>
        <KEY2>BLUE</KEY2>
        <KEY3>2014</KEY3>
        <ATTRIBUTE1>3000</ATTRIBUTE1>
        <ATTRIBUTE2>400000</ATTRIBUTE2>
    </REGISTER>
</OUTPUT>"""

tree = etree.fromstring(x)
xpath = "//REGISTER[./KEY1/text()='TRUCK' and ./KEY2/text()='BLUE' and ./KEY3/text()='2014']/ATTRIBUTE1"
for attribute1 in tree.xpath(xpath):
    print(attribute1.text)

Output:

3000
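If at most one REGISTER is expected to match, the for loop can be dropped. A minimal sketch, using findtext from the ElementTree-style API that lxml also implements: it returns the text of the first match, or None when nothing matches. The path uses the simpler ElementPath syntax (chained predicates) rather than full XPath, and the fallback import is an assumption added so the snippet also runs where lxml is not installed.

```python
try:
    from lxml import etree  # as in the example above
except ImportError:
    # Assumption: fall back to the standard library, which offers the same
    # findtext() call with the same path syntax.
    import xml.etree.ElementTree as etree

x = """<OUTPUT>
    <REGISTER>
        <RESULT>0</RESULT>
        <KEY1>CAR</KEY1>
        <KEY2>RED</KEY2>
        <KEY3>2013</KEY3>
        <ATTRIBUTE1>2000</ATTRIBUTE1>
        <ATTRIBUTE2>100000</ATTRIBUTE2>
    </REGISTER>
    <REGISTER>
        <RESULT>0</RESULT>
        <KEY1>TRUCK</KEY1>
        <KEY2>BLUE</KEY2>
        <KEY3>2014</KEY3>
        <ATTRIBUTE1>3000</ATTRIBUTE1>
        <ATTRIBUTE2>400000</ATTRIBUTE2>
    </REGISTER>
</OUTPUT>"""

tree = etree.fromstring(x)

# Chained predicates select the REGISTER whose three keys all match.
path = ".//REGISTER[KEY1='TRUCK'][KEY2='BLUE'][KEY3='2014']/ATTRIBUTE1"

value = tree.findtext(path)  # text of the first match only
print(value)  # 3000

# With no matching REGISTER, findtext returns None instead of raising.
missing = tree.findtext(".//REGISTER[KEY1='BUS']/ATTRIBUTE1")
print(missing)  # None
```

With the full XPath query from the answer, tree.xpath(xpath) returns a list, so the single value could equally be taken from its first element after checking that the list is not empty.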
  • +1, but perhaps you can use the `/text()` for the conditions? Otherwise it's problematic if they have attributes as well... – Willem Van Onsem Apr 02 '15 at 12:59
  • `.text()` is not accepted by lxml XPath. This `./KEY3.text()='2014'` gives `XPathEvalError: Invalid expression` :( –  Apr 02 '15 at 13:03
  • It's not `.text()`, but `/text()` (with a slash). – Willem Van Onsem Apr 02 '15 at 14:26
  • @CommuSoft I'm trying this solution as well. Is there a way to obtain the value directly if I know there is exactly one register with those properties? – rlartiga Apr 02 '15 at 17:07
  • @rlartiga: I don't understand the question... grammatically it doesn't make much sense, could you please rephrase it? – Willem Van Onsem Apr 02 '15 at 17:09
  • @CommuSoft OK, so he uses a for loop because there can be more than one value, right? What if I know the value is unique? – rlartiga Apr 02 '15 at 17:11
  • @rlartiga: I think there is an iterator version where you thus can break off the search after the first found match. But I'm not familiar with this library. – Willem Van Onsem Apr 02 '15 at 17:23