0

I'm trying to crawl a website for a bot I'm working on. I'm not very experienced with XPath. Right now I can get some information, but the website I'm crawling has guides (guides for a video game), and I want to get the title of a guide, but my code doesn't output anything. I'll explain my code:

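import requests
from lxml import html

name = input("> ")

page = requests.get("http://www.mobafire.com/league-of-legends/champions")
tree = html.fromstring(page.content)

# champ_list is a list of champion names, defined elsewhere (not shown here)
for index, champ in enumerate(champ_list):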
    if name == champ:
        y = tree.xpath(".//*[@id='browse-build']/a[{}]/@href".format(index + 1))
        print(y)


guide = requests.get("http://www.mobafire.com/league-of-legends/champion/ashe-13")
builds = html.fromstring(guide.content)
print(builds)

for title in builds.xpath(".//*[@id='browse-build']/table/tbody/tr[1]/td/text()"):
    print(title)

OK, the input is looked up in a list, and from that list the code extracts a link, which would go into the guide variable. From that page I want to get the title of the first guide, but nothing is output. I get a status code 200, so I know the URL itself is fine. I tried nesting this:

guide = requests.get("http://www.mobafire.com/league-of-legends/champion/ashe-13")
builds = html.fromstring(guide.content)
print(builds)

for title in builds.xpath(".//*[@id='browse-build']/table/tbody/tr[1]/td/text()"):
    print(title)

inside the for loop above, but that doesn't do anything either; the program simply finishes. The code above shows the site I'm getting the information from. I don't know what the right approach would be. If there is anything else I should add, please tell me. Thanks for any help.

Aguxez
  • 378
  • 5
  • 16
  • If it's not printing anything, that means it didn't find any elements matching that xpath. I see that page has two elements with `id="browse-build"`, but ids are supposed to be unique. Perhaps that is confusing your search?
    – John Gordon Nov 26 '16 at 17:19
  • @JohnGordon Yeah, I see it now, that was the problem, thanks! – Aguxez Nov 26 '16 at 18:03

2 Answers

1

The site has a namespace defined (xmlns="http://www.w3.org/1999/xhtml"). You have to account for that namespace in these XPath expressions. For more information, see the question "Xml Namespace breaking my xpath!".
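
As a minimal sketch of the general issue, assuming the document is parsed as XML (where a default namespace applies to every element): the snippet below uses the standard library's `xml.etree.ElementTree` rather than lxml so that it runs without extra dependencies, and `DOC` is a hypothetical stand-in for the real page, not its actual markup.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical, well-formed stand-in for the real page.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <table class="browse-table">
      <tr><td>Season 7 Guides</td></tr>
    </table>
  </body>
</html>"""

root = ET.fromstring(DOC)

# Unqualified names do not match elements that live in the default namespace.
plain = root.findall(".//td")

# Binding a prefix (here "x", an arbitrary choice) to the namespace URI
# lets the same path match.
qualified = root.findall(".//x:td", {"x": XHTML_NS})

print(len(plain))                      # 0
print([td.text for td in qualified])   # ['Season 7 Guides']
```

With lxml, the comparable approach would be to pass a prefix mapping through the `namespaces` keyword of `xpath()`, for example `tree.xpath("//x:td/text()", namespaces={"x": XHTML_NS})`.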

Saurav Shanu
  • 171
  • 1
  • 4
1

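As noted in the comments, an id must be unique. Of the two expressions below, the first works. The second most likely fails because the HTML the server sends doesn't actually contain a tbody element; browsers typically insert one when they build the DOM, so it shows up in developer tools even though it is absent from the source.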

>>> for item in builds.xpath(""".//table[@class='browse-table']/tr[1]/td/text()"""):
...     item
...     
'Season 7 Guides'

>>> for item in builds.xpath(""".//table[@class='browse-table']/tbody/tr[1]/td/text()"""):
...     item
... 

I don't know whether this provides a path to the results you want, however, since you didn't specify them.
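
The tbody effect can be reproduced offline. The following is only a sketch: `RAW` is a hypothetical fragment shaped like the relevant part of the page (rows directly under the table), `first_row_cells` is a helper name made up for this example, and it uses the standard library's `xml.etree.ElementTree` instead of lxml so it runs without extra dependencies.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup mirroring what the server sends:
# the <tr> rows sit directly under <table>, with no <tbody> in between.
RAW = """<div id="browse-build">
  <table class="browse-table">
    <tr><td>Season 7 Guides</td></tr>
    <tr><td>Another row</td></tr>
  </table>
</div>"""

root = ET.fromstring(RAW)


def first_row_cells(path):
    """Return the text of every <td> element matched by `path`."""
    return [td.text for td in root.findall(path)]


# Path that follows the markup as sent by the server.
without_tbody = first_row_cells(".//table[@class='browse-table']/tr[1]/td")

# Path copied from a browser's element inspector, which shows an inserted <tbody>.
with_tbody = first_row_cells(".//table[@class='browse-table']/tbody/tr[1]/td")

print(without_tbody)  # ['Season 7 Guides']
print(with_tbody)     # []
```

An empty result with no error is what makes this easy to miss: the loop over the matches simply runs zero times.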

Bill Bell
  • 21,021
  • 5
  • 43
  • 58