Xpath get text of nested item not working but css does

Question

I'm building a crawler with Scrapy and wondering why my XPath expression returns nothing when the equivalent CSS selector works. I want to get the number of commits from this HTML:

<li class="commits">
    <a data-pjax="" href="/samthomson/flot/commits/master">
        <span class="octicon octicon-history"></span>
        <span class="num text-emphasized">
          521
        </span>
        commits
    </a>
  </li>

Xpath:

response.xpath('//li[@class="commits"]//a//span[@class="text-emphasized"]//text()').extract()

CSS:

response.css('li.commits a span.text-emphasized').css('::text').extract()

CSS returns the number (unescaped), but XPath returns nothing. Am I using the // for nested elements correctly?

Sicco · Accepted Answer · 2015-09-19T12:39:16.263

The predicate [@class="text-emphasized"] compares the whole class attribute as a single string, and the span's attribute is "num text-emphasized", so it never matches. Use the contains function to check only that text-emphasized is present:

response.xpath('//li[@class="commits"]//a//span[contains(@class, "text-emphasized")]//text()').extract()[0].strip()

Otherwise, match the full attribute value, including num:

response.xpath('//li[@class="commits"]//a//span[@class="num text-emphasized"]//text()').extract()[0].strip()

Here extract() returns the matched text nodes as a list of strings, [0] takes the first one, and strip() removes the surrounding whitespace, leaving just the number.
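
To see why the original predicate returned nothing, here is a minimal standard-library sketch rather than Scrapy itself. It assumes `xml.etree.ElementTree`, which supports only a subset of XPath (no `contains()`), so the class-token check is done in plain Python. The `HTML` constant and the `has_class` helper are illustrative names, not part of any library.

```python
import xml.etree.ElementTree as ET

# The markup from the question, with the closing </li> tag completed.
HTML = """
<li class="commits">
    <a data-pjax="" href="/samthomson/flot/commits/master">
        <span class="octicon octicon-history"></span>
        <span class="num text-emphasized">
          521
        </span>
        commits
    </a>
</li>
""".strip()

root = ET.fromstring(HTML)

# [@class='...'] compares the whole attribute string, so a single class name
# does not match an element that carries two classes.
exact_single = root.findall(".//span[@class='text-emphasized']")

# Matching the full attribute value works.
exact_full = root.findall(".//span[@class='num text-emphasized']")


def has_class(element, name):
    """Hypothetical helper: True if `name` is one of the element's classes."""
    return name in element.get("class", "").split()


# Token-based check, similar in spirit to what a CSS class selector does.
by_token = [s for s in root.iter("span") if has_class(s, "text-emphasized")]

print(len(exact_single))           # 0
print(exact_full[0].text.strip())  # 521
print(by_token[0].text.strip())    # 521
```

The token-based check behaves like the CSS selector span.text-emphasized, which is why the CSS version in the question matched while the exact-string XPath predicate did not.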

edited Sep 19 '15 at 12:39

answered Sep 19 '15 at 12:25

Thanks. I thought that by specifying the text-emphasized class I was narrowing it down from all spans. Do you know why [@class="text-emphasized"] didn't work? For example, is the [@class="commits"] pointless? – S.. Sep 19 '15 at 12:39

In XPath, each extra condition narrows the match, so you can leave out any condition that a broader query doesn't need. E.g., on this snippet these queries also work: `response.xpath('//li/a/span/text()').extract()[0].strip()` or `response.xpath('//span/text()').extract()[0].strip()`. Note that `//` searches through all descendants, while `/` only steps to direct children. – Sicco Sep 19 '15 at 12:56
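
The difference between `/` and `//` can be checked with a small standard-library sketch. It assumes `xml.etree.ElementTree` and a simplified one-line version of the question's markup; the `FRAGMENT` string and its `href="#"` value are illustrative only.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical version of the snippet from the question.
FRAGMENT = (
    '<li class="commits"><a href="#">'
    '<span class="num text-emphasized"> 521 </span>'
    '</a></li>'
)
li = ET.fromstring(FRAGMENT)

# A single slash steps to direct children only: <span> is a grandchild of
# <li>, so "./span" finds nothing, while "./a/span" spells out each level.
direct = li.findall("./span")
stepwise = li.findall("./a/span")

# A double slash searches every descendant, however deeply nested.
anywhere = li.findall(".//span")

print(len(direct), len(stepwise), len(anywhere))  # 0 1 1
print(anywhere[0].text.strip())                   # 521
```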
