Can I parse xpath using python, selenium and lxml?

Question

So I have been trying to figure out how to use BeautifulSoup, did a quick search, and found that lxml can evaluate XPath expressions against an HTML page. I would LOVE to be able to do that, but the tutorial isn't that intuitive.

I know how to use Firebug to grab an XPath and was curious whether anyone has used lxml and can explain how to use it to pull out the elements matching specific XPaths and print them, say 5 per line. Or is that even possible?!

Selenium is using Chrome and loads the page properly, just need help moving forward.

Thanks!

Okay. To use xpath on xml docs with python, see element tree http://docs.python.org/2/library/xml.etree.elementtree.html#xpath-support . You may not be able to parse all html docs right off the web as they may not be all valid xml docs. See http://stackoverflow.com/questions/285990/parse-html-via-xpath — Himanshu, Dec 20 '12 at 05:31

score 1 · Answer 1 · answered Dec 20 '12 at 07:29

lxml's ElementTree has an .xpath() method (note that the ElementTree in the xml package of the Python standard library doesn't have one; it only supports a limited XPath subset through find() and findall()).

e.g.

# see http://lxml.de/xpathxslt.html

from lxml import etree

# root = etree.parse('/tmp/stack-overflow-questions.xml')
root = etree.XML('''
        <answers>
            <answer author="dlam" question-id="13965403">AAA</answer>
        </answers>
''')

all_answers = root.xpath('.//answer')

for i, answer in enumerate(all_answers):
    who_answered = answer.attrib['author']
    question_id = answer.attrib['question-id']
    answer_text = answer.text
    print('Answer #{0} by {1}: {2}'.format(i, who_answered, answer_text))
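
For HTML rather than XML, and for the "5 per line" part of the question, something along these lines might work. This is only a sketch: the PAGE markup and the chunked() helper are made up for illustration, and it assumes lxml.html is an acceptable way to parse a page that is not valid XML (the sample deliberately leaves its <li> tags unclosed).

```python
from lxml import html

# Hypothetical markup standing in for a downloaded page.
# The <li> tags are left unclosed on purpose; it is HTML, not valid XML.
PAGE = """
<html><body>
  <ul id="tags">
    <li>python<li>selenium<li>lxml<li>xpath<li>html<li>firebug<li>chrome
  </ul>
</body></html>
"""


def chunked(items, size):
    """Split a list into consecutive sub-lists of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


doc = html.fromstring(PAGE)

# text() yields the text nodes of each <li>; drop whitespace-only ones.
raw = doc.xpath('//ul[@id="tags"]//li/text()')
tags = [t.strip() for t in raw if t.strip()]

# Print the matches five per line.
rows = chunked(tags, 5)
for row in rows:
    print(', '.join(row))
```

An XPath copied from Firebug can be passed to doc.xpath() in the same way, although browser-generated paths sometimes contain elements (such as tbody) that are not in the raw source.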

score 0 · Answer 2 · answered Oct 27 '16 at 05:26

I prefer to use lxml, because it is more efficient than Selenium when extracting a large number of elements. You can use Selenium to get the source of the web page and then parse that source with lxml's XPath support instead of Selenium's native find_elements_by_xpath.
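
A rough sketch of that approach is below. The extract_links() helper and the sample markup are hypothetical; in a real session the string would come from driver.page_source, which is shown only in the comments so that the example runs without a browser.

```python
from lxml import html


def extract_links(page_source):
    """Return (text, href) pairs for every <a> with an href in an HTML string."""
    doc = html.fromstring(page_source)
    return [(a.text_content().strip(), a.get('href'))
            for a in doc.xpath('//a[@href]')]


# With Selenium, the string would come from the browser, for example:
#     from selenium import webdriver
#     driver = webdriver.Chrome()
#     driver.get(url)              # `url` is a placeholder
#     source = driver.page_source
# A hard-coded sample (an assumption, not a real page) is used here instead.
source = (
    '<html><body>'
    '<a href="/q/1">First</a> <a href="/q/2">Second</a><a>no href</a>'
    '</body></html>'
)

links = extract_links(source)
for text, href in links:
    print('{0} -> {1}'.format(text, href))
```

The page is parsed once and every XPath query after that runs locally against the lxml tree.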
