
Here is my code:

from lxml import html
import requests

page = requests.get('https://en.wikipedia.org/wiki/Nabucco')
tree = html.fromstring(page.content)
title = tree.xpath('//*[@id="mw-content-text"]/table[1]/tbody/tr[1]/th/i')
print(title)

Problem: print(title) prints "[]", an empty list. I expected it to print "Nabucco". The XPath expression comes from the "Copy XPath" function in Chrome's inspector.

Why isn't this working? Is there a disagreement between lxml's and Chrome's XPath engines? Or am I missing something? I am somewhat new to Python, lxml and XPath.

noctonura
  • Possible duplicate of [Why does this xpath fail using lxml in python?](http://stackoverflow.com/questions/23900348/why-does-this-xpath-fail-using-lxml-in-python) –  Jan 28 '16 at 20:48

1 Answer


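That's because of the tbody tag. It shows up in the browser's inspector because the browser inserted it while building the DOM; it is not in the HTML the server sent. requests is not a browser and just downloads the page source as is, so lxml never sees a tbody element and the path matches nothing.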

Replace:

//*[@id="mw-content-text"]/table[1]/tbody/tr[1]/th/i

with:

//*[@id="mw-content-text"]/table[1]/tr[1]/th/i
alecxe
    Or just use `//*[@id="mw-content-text"]/table[1]//tr[1]/th/i` and cover both cases. (Replace `/tbody` with `//`.) – kjhughes Nov 14 '15 at 18:58
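To see the effect of the tbody tag without hitting the network, here is a minimal sketch that runs the three paths discussed above against two small hand-written snippets. Everything in it is an assumption made for illustration: RAW_SOURCE and BROWSER_DOM are hypothetical stand-ins for the real Wikipedia markup, titles() is a made-up helper, and the standard library's xml.etree.ElementTree is swapped in for lxml so the example has no third-party dependency. ElementTree supports only a subset of XPath and needs relative paths, hence the leading dot.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the page source as downloaded (no tbody).
RAW_SOURCE = """
<html><body><div id="mw-content-text">
  <table><tr><th><i>Nabucco</i></th></tr></table>
</div></body></html>
"""

# Hypothetical stand-in for the DOM the browser shows (tbody inserted).
BROWSER_DOM = """
<html><body><div id="mw-content-text">
  <table><tbody><tr><th><i>Nabucco</i></th></tr></tbody></table>
</div></body></html>
"""

# The path copied from the inspector, the corrected path, and the
# variant from the comment that replaces /tbody with //.
WITH_TBODY = ".//*[@id='mw-content-text']/table[1]/tbody/tr[1]/th/i"
WITHOUT_TBODY = ".//*[@id='mw-content-text']/table[1]/tr[1]/th/i"
EITHER = ".//*[@id='mw-content-text']/table[1]//tr[1]/th/i"


def titles(markup, path):
    """Return the text of every element that path matches in markup."""
    root = ET.fromstring(markup.strip())
    return [element.text for element in root.findall(path)]


if __name__ == "__main__":
    for name, markup in (("raw source", RAW_SOURCE), ("browser DOM", BROWSER_DOM)):
        for path in (WITH_TBODY, WITHOUT_TBODY, EITHER):
            print(name, "|", path, "|", titles(markup, path))
```

Only the `//` variant matches in both snippets. Note also that a successful match is a list of elements, not a string, so in the original lxml code the title would come from something like `title[0].text` rather than from printing `title` directly.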