On the website http://www.apkmirror.com/apk/opera-software-asa/opera-mini/opera-mini-28-0-2254-119213-release/opera-mini-fast-web-browser-28-0-2254-119213-2-android-apk-download/, I'm trying to extract several fields from the same XPath selector using Item Loaders. To avoid code repetition, I'd like to use the nested_xpath method.
To this end, I would like a relative XPath selector that is essentially a 'no-op' and simply returns the input selection. I thought this would be .//*, but that does not seem to work.
If I start the Scrapy shell with
scrapy shell http://www.apkmirror.com/apk/opera-software-asa/opera-mini/opera-mini-28-0-2254-119213-release/opera-mini-fast-web-browser-28-0-2254-119213-2-android-apk-download/ -s USER_AGENT=Mozilla
then the following XPath expression gives me the desired result:
In [2]: response.xpath('//*[@title="APK details"]/following-sibling::*//text()')
...: .extract()
Out[2]:
['Version: 28.0.2254.119213 (281119213)',
'arm ',
'Package: com.opera.mini.native',
'\n',
'183 downloads ']
However, if I chain .xpath('.//*') onto this, the result is an empty list:
In [3]: response.xpath('//*[@title="APK details"]/following-sibling::*//text()')
...: .xpath('.//*').extract()
Out[3]: []
What would be the correct 'no-op' XPath selector in this case?