XML searching python

Question

I have an XML document of the following form:

<OUTPUT>
    <HEADER>

    </HEADER>
    <REGISTER>
        <RESULT>0</RESULT>
        <KEY1>CAR</KEY1>
        <KEY2>RED</KEY2>
        <KEY3>2013</KEY3>
        <ATTRIBUTE1>2000</ATTRIBUTE1>
        <ATTRIBUTE2>100000</ATTRIBUTE2>
    </REGISTER>
    <REGISTER>
        <RESULT>0</RESULT>
        <KEY1>TRUCK</KEY1>
        <KEY2>BLUE</KEY2>
        <KEY3>2014</KEY3>
        <ATTRIBUTE1>3000</ATTRIBUTE1>
        <ATTRIBUTE2>400000</ATTRIBUTE2>
    </REGISTER>
</OUTPUT>

How can I look up the value of ATTRIBUTE1 for the REGISTER whose KEY1, KEY2 and KEY3 have given values, without looping in Python? (Something like a lambda expression in C#.)

Thanks to @CommuSoft I know that I can use XPath queries with libxml2. But when I try to install it using pip install libxml2-python I get the error

Could not find any downloads that satisfy the requirement libxml2-python

Also, I forgot to mention that I'm using Python 2.7 on Anaconda and Windows.

@CommuSoft could you give me an example with the XML I gave? For example, how do I search for the ATTRIBUTE1 value if KEY1=TRUCK, KEY2=BLUE, KEY3=2014? – rlartiga, Apr 02 '15 at 12:50

score 3 · Accepted Answer · edited May 23 '17 at 12:28

In general it is better to process XML with a library, and in this case specifically with an XPath query.

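import libxml2

# Parse the file and create an XPath evaluation context for it.
doc = libxml2.parseFile("tst.xml")
ctxt = doc.xpathNewContext()
res = ctxt.xpathEval("//REGISTER/ATTRIBUTE1[../KEY1/text()='TRUCK' and ../KEY2/text()='BLUE' and ../KEY3/text()='2014']")

# res is a list of matching nodes; print the text of each one.
for node in res:
    print(node.content)

# Release the document and the context explicitly.
doc.freeDoc()
ctxt.xpathFreeContext()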

Here the query is:

//REGISTER/ATTRIBUTE1[../KEY1/text()='TRUCK' and ../KEY2/text()='BLUE' and ../KEY3/text()='2014']

The result, a list of matching nodes, is stored in res.
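If the key values are only known at run time, the query string above can be assembled from them. The following is a minimal sketch, not part of the original answer: the helper name build_register_query and its parameters are hypothetical, and it assumes the values contain no single quotes, since no escaping is attempted.

```python
def build_register_query(conditions, target="ATTRIBUTE1"):
    """Build an XPath query selecting `target` inside a matching REGISTER.

    `conditions` is a sequence of (tag, value) pairs. Values are assumed
    not to contain single quotes; this sketch does no escaping.
    """
    # One "../TAG/text()='VALUE'" test per pair, joined with "and".
    predicate = " and ".join(
        "../{0}/text()='{1}'".format(tag, value) for tag, value in conditions
    )
    return "//REGISTER/{0}[{1}]".format(target, predicate)


query = build_register_query(
    [("KEY1", "TRUCK"), ("KEY2", "BLUE"), ("KEY3", "2014")]
)
print(query)
```

The resulting string can then be passed to ctxt.xpathEval in place of the literal query.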

answered Apr 02 '15 at 12:53

Willem Van Onsem


`libxml2-python` is only a binding; did you run `sudo apt-get install libxml2-dev libxslt1-dev` first? – Willem Van Onsem Apr 02 '15 at 14:24
I'm using Windows (sudo is not available on Windows as far as I know). Which command do I have to execute? – rlartiga Apr 02 '15 at 14:28
@rlartiga: does [this](http://stackoverflow.com/questions/3520826/installing-libxml2-on-python-2-7-windows) help? – Willem Van Onsem Apr 02 '15 at 14:31
@rlartiga: can you update your question with the error you get with `pip`? – Willem Van Onsem Apr 02 '15 at 14:54
I deleted my comments and updated the answer, thanks – rlartiga Apr 02 '15 at 14:59
@rlartiga: can you install it with these [executables](http://users.skynet.be/sbi/libxml-python/)? – Willem Van Onsem Apr 02 '15 at 15:14
Thanks, sorry for the inconvenience – rlartiga Apr 02 '15 at 15:17

score 2 · Answer 2 · 2015-04-02T14:27:02.030

Use lxml with XPath.

import lxml.etree as etree

x = """<OUTPUT>
    <HEADER>

    </HEADER>
    <REGISTER>
        <RESULT>0</RESULT>
        <KEY1>CAR</KEY1>
        <KEY2>RED</KEY2>
        <KEY3>2013</KEY3>
        <ATTRIBUTE1>2000</ATTRIBUTE1>
        <ATTRIBUTE2>100000</ATTRIBUTE2>
    </REGISTER>
    <REGISTER>
        <RESULT>0</RESULT>
        <KEY1>TRUCK</KEY1>
        <KEY2>BLUE</KEY2>
        <KEY3>2014</KEY3>
        <ATTRIBUTE1>3000</ATTRIBUTE1>
        <ATTRIBUTE2>400000</ATTRIBUTE2>
    </REGISTER>
</OUTPUT>"""

tree = etree.fromstring(x)
xpath = "//REGISTER[./KEY1/text()='TRUCK' and ./KEY2/text()='BLUE' and ./KEY3/text()='2014']/ATTRIBUTE1"
for attribute1 in tree.xpath(xpath):
    print(attribute1.text)

Output:

3000

edited Apr 02 '15 at 14:27

answered Apr 02 '15 at 12:52

+1, but perhaps you can use the `/text()` for the conditions? Otherwise it's problematic if they have attributes as well... – Willem Van Onsem Apr 02 '15 at 12:59
`.text()` is not accepted by lxml XPath. This `./KEY3.text()='2014'` gives `XPathEvalError: Invalid expression` :( – Apr 02 '15 at 13:03
It's not `.text()`, but `/text()` (with a slash). – Willem Van Onsem Apr 02 '15 at 14:26
@CommuSoft I'm trying this solution as well. Is there a way to obtain the value directly if I know there is only one register with those properties? – rlartiga Apr 02 '15 at 17:07
@rlartiga: I don't understand the question... grammatically it doesn't make much sense, can you please reformulate it? – Willem Van Onsem Apr 02 '15 at 17:09
@CommuSoft OK, he uses a for loop because there can be more than one value, right? What if I know there is a unique value? – rlartiga Apr 02 '15 at 17:11
@rlartiga: I think there is an iterator version where you thus can break off the search after the first found match. But I'm not familiar with this library. – Willem Van Onsem Apr 02 '15 at 17:23
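On the single-value case raised in the comments: with lxml, tree.xpath(...) returns a list for a query like this, so the first element can be taken directly when a match is known to exist. The sketch below shows a first-match lookup using the standard library's xml.etree.ElementTree instead of lxml (a deliberate swap, so that it runs without extra installs). It assumes Python 3, where ElementTree's limited XPath subset accepts [tag='text'] predicates; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

xml_text = """<OUTPUT>
    <HEADER>
    </HEADER>
    <REGISTER>
        <RESULT>0</RESULT>
        <KEY1>CAR</KEY1>
        <KEY2>RED</KEY2>
        <KEY3>2013</KEY3>
        <ATTRIBUTE1>2000</ATTRIBUTE1>
        <ATTRIBUTE2>100000</ATTRIBUTE2>
    </REGISTER>
    <REGISTER>
        <RESULT>0</RESULT>
        <KEY1>TRUCK</KEY1>
        <KEY2>BLUE</KEY2>
        <KEY3>2014</KEY3>
        <ATTRIBUTE1>3000</ATTRIBUTE1>
        <ATTRIBUTE2>400000</ATTRIBUTE2>
    </REGISTER>
</OUTPUT>"""

root = ET.fromstring(xml_text)

# Chained predicates act as a logical "and" in ElementTree's XPath subset.
path = ".//REGISTER[KEY1='TRUCK'][KEY2='BLUE'][KEY3='2014']/ATTRIBUTE1"

# findtext returns the text of the first matching element, or None if
# nothing matches, so no explicit loop is needed.
value = root.findtext(path)
print(value)
```

If no register matches, findtext returns None rather than raising, so the caller should check for that case.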
