I have this block of XML.

    <procedure>
        <name1>first</name1>
        <name2>second</name2>
        <name3>third</name3>
    </procedure>

How can I retrieve both the values (first, second, third; this part I can already do) and the attributes (name1, name2, name3) with an XPath expression?

Thank you

Quick

2 Answers

To retrieve the values 'first', 'second' and 'third', you can use the XPath below, which selects every child element of procedure.

//procedure/*

You can also get each value separately.

//procedure/name1
//procedure/name2
//procedure/name3

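name1, name2 and name3 are element nodes, not attributes. To get the element names, you can use the XPath below. A function call as the final step requires XPath 2.0 or later; in XPath 1.0, name(//procedure/*) returns only the name of the first matching element.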

//procedure/*/name()
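
If the XPath engine at hand cannot evaluate name() as a path step, one option is to select the child elements and read their names and values in the host language. The following is a minimal sketch using Python's standard xml.etree.ElementTree module, which is an assumption on my part (the question does not name a language or library). Because the parsed root is the procedure element itself, the relative path "./*" is used here in place of //procedure/*.

```python
import xml.etree.ElementTree as ET

XML = """<procedure>
    <name1>first</name1>
    <name2>second</name2>
    <name3>third</name3>
</procedure>"""

# fromstring returns the root element, which is <procedure> here.
root = ET.fromstring(XML)

# "./*" selects every child element of <procedure>.
children = root.findall("./*")

names = [child.tag for child in children]    # element names
values = [child.text for child in children]  # text content of each element

print(names)
print(values)
```

Keeping names and values in two parallel lists makes it easy to pair them afterwards, for example with dict(zip(names, values)).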
Amol Patil

Using Python's lxml module:

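import lxml.etree

if __name__ == '__main__':
    text = """<procedure>
        <name1>first</name1>
        <name2>second</name2>
        <name3>third</name3>
    </procedure>"""

    # The input is XML, so parse it with lxml.etree rather than the HTML parser.
    root = lxml.etree.fromstring(text)

    # descendant::node() yields elements and text nodes in document order.
    for item in root.xpath('./descendant::node()'):
        if isinstance(item, str):
            # Text node: skip whitespace-only text between elements.
            if item.strip():
                print(item)
        else:
            # Element node: print its name.
            print(item.tag)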

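Pay attention to the XPath expression used, as noted in this SO answer. ./descendant::node() matches text nodes as well as elements, so both the names and the values are printed.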

The code above prints:

name1
first
name2
second
name3
third

You just have to save it in a .py file and execute it with: python3 script.py.
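
If lxml is not installed, a similar traversal can be sketched with the standard library. This is an assumed alternative, not part of the approach above: xml.etree.ElementTree has no descendant::node() axis, so the sketch walks the elements with iter() and reads each element's text directly.

```python
import xml.etree.ElementTree as ET

XML = """<procedure>
    <name1>first</name1>
    <name2>second</name2>
    <name3>third</name3>
</procedure>"""

root = ET.fromstring(XML)

lines = []
for element in root.iter():
    if element is root:
        # iter() includes the root itself; skip it so only descendants remain.
        continue
    lines.append(element.tag)
    # element.text may be None or whitespace-only; keep only real content.
    text = (element.text or "").strip()
    if text:
        lines.append(text)

print("\n".join(lines))
```

For this input the script prints the same six lines as the lxml version, with each element name followed by its value.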

Kfcaio