I am trying to set up a Scrapy selector to fetch data from a table on Trezor's supported coins page (https://trezor.io/coins/):
In [1]: import requests
...: from scrapy.selector import Selector
...: req = requests.get('https://trezor.io/coins/').content
...: xs = '//*[@id="content"]/tr'
...: sel = Selector(text=req).xpath(xs)
In [2]: sel.extract_first()
Out[2]: '<tr class="coin " data-href="./#BTC" id="BTC"></tr>'
Shouldn't the selector return the tr element and everything inside it (in this case, six td elements with more inner elements)? When I try to access the td elements directly (with either xs = '//*[@id="content"]/tr[1]/td' or xs = '//*[@id="content"]/tr[1]/td[1]'), all I get is an empty list. I have also tried getting child nodes, but to no avail.
Compare this with the same kind of extraction on Wikipedia's main page, where I get everything inside the specified container:
In [3]: req2 = requests.get('https://en.wikipedia.org/wiki/Main_Page').content
...: xd = '//*[@id="mp-welcomecount"]'
...: sel2 = Selector(text=req2).xpath(xd)
In [4]: sel2.extract_first()
Out[4]: '<div id="mp-welcomecount">\n<div id="mp-welcome">Welcome to <a href="/wiki/Wikipedia" title="Wikipedia">Wikipedia</a>,</div>\n<div id="mp-free">the <a href="/wiki/Free_content" title="Free content">free</a> <a href="/wiki/Encyclopedia" title="Encyclopedia">encyclopedia</a> that <a href="/wiki/Help:Introduction" title="Help:Introduction">anyone can edit</a>.</div>\n<div id="articlecount"><a href="/wiki/Special:Statistics" title="Special:Statistics">6,088,421</a> articles in <a href="/wiki/English_language" title="English language">English</a></div>\n</div>'
Why is it that in Trezor's case I only get the tr element, and how do I correct my code to return everything that is contained inside it?
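One way to narrow this down, as a rough sketch: check whether the raw markup that requests receives actually nests the td tags inside the tr tags, before any tree-building parser repairs it. The sketch uses only the standard library's html.parser, which reports tags in the order they are written without rebuilding a tree. The names RowCellCounter and count_cells and both sample strings are hypothetical and are not taken from Trezor's page. Whether Trezor's markup really closes its rows early is an assumption that has not been verified here.

```python
from html.parser import HTMLParser


class RowCellCounter(HTMLParser):
    """Count <td> start tags seen while a <tr> is still open in the raw markup."""

    def __init__(self):
        super().__init__()
        self.open_rows = 0      # number of <tr> tags currently open
        self.cells_inside = 0   # <td> tags seen while a <tr> is open
        self.cells_outside = 0  # <td> tags seen with no <tr> open

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.open_rows += 1
        elif tag == "td":
            if self.open_rows:
                self.cells_inside += 1
            else:
                self.cells_outside += 1

    def handle_endtag(self, tag):
        if tag == "tr" and self.open_rows:
            self.open_rows -= 1


def count_cells(raw_html):
    """Return (cells inside a <tr>, cells outside any <tr>) for the given markup."""
    parser = RowCellCounter()
    parser.feed(raw_html)
    parser.close()
    return parser.cells_inside, parser.cells_outside


# Hypothetical snippets for illustration only, NOT Trezor's actual markup.
well_formed = (
    '<table><tbody id="content">'
    '<tr id="BTC"><td>Bitcoin</td><td>BTC</td></tr>'
    '</tbody></table>'
)
row_closed_early = (
    '<table><tbody id="content">'
    '<tr id="BTC"></tr><td>Bitcoin</td><td>BTC</td>'
    '</tbody></table>'
)

print(count_cells(well_formed))       # (2, 0)
print(count_cells(row_closed_early))  # (0, 2)
```

To run the same check against the real page, the response body from requests.get('https://trezor.io/coins/').text could be passed to count_cells. A non-zero second number would suggest that the td tags are not written inside the tr tags in the served markup, in which case the XPath tr[1]/td would have nothing to match.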