On the website http://www.apkmirror.com/apk/redditinc/reddit/reddit-1-5-5-release/reddit-1-5-5-android-apk-download/, I'm trying to extract the lines containing the Min: and Target: versions of Android (see screenshot below).
In the Scrapy shell, so far I've come up with the XPath expression
In [1]: android_version = response.xpath('//*[@title="Android version"]/following-sibling::*[@class="appspec-value"]')
such that if I append .//text() and call extract(), I get several lines including the ones I want:
In [2]: android_version_text = android_version.xpath('.//text()').extract()
In [3]: android_version_text
Out[3]:
[u'\n',
u'Min: Android 4.0.3 (Ice Cream Sandwich MR1, API 15) ',
u'\n',
u'Target: Android 6.0 (Marshmallow, API 23)',
u'\n']
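To make this reproducible outside the Scrapy shell, here is a minimal sketch using lxml directly (Scrapy's selectors are built on top of it). The SAMPLE markup is an assumption: it is my guess at the page structure, reconstructed from the output above, and the real page may well differ.

```python
from lxml import html

# Hypothetical markup reconstructed from the extract() output above;
# this is NOT the actual apkmirror page source.
SAMPLE = """
<html><body>
<div>
  <div class="appspec-icon" title="Android version"></div>
  <div class="appspec-value">
<div>Min: Android 4.0.3 (Ice Cream Sandwich MR1, API 15) </div>
<div>Target: Android 6.0 (Marshmallow, API 23)</div>
</div>
</div>
</body></html>
"""

tree = html.document_fromstring(SAMPLE)

# Same expression as in the Scrapy shell session.
android_version = tree.xpath(
    '//*[@title="Android version"]/following-sibling::*[@class="appspec-value"]'
)

# All descendant text nodes, including whitespace-only ones.
android_version_text = android_version[0].xpath('.//text()')

# Drop the whitespace-only nodes to see just the two lines of interest.
lines = [t.strip() for t in android_version_text if t.strip()]
print(lines)
```

Filtering in Python like this works, but I would prefer to do the selection in the XPath expression itself.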
I would now like to refine the XPath expression to get only the fields whose text() contains "Min:" or "Target:". Following the question "XPath contains(text(),'some string') doesn't work when used with node with more than one Text subnode", I've tried
In [7]: android_version.xpath('.//*[contains(text(), "Min:"]')
but this gives rise to a
ValueError: XPath error: Invalid expression in .//*[contains(text(), "Min:"]
How could I construct an XPath expression to get only the Min: line, for example?
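For what it's worth, the error looks like a plain syntax problem: the expression never closes the parenthesis opened by contains(. Below is a sketch of two candidate expressions, tested with lxml rather than Scrapy. The SAMPLE markup is an assumption reconstructed from the output earlier in this post, so whether either expression behaves the same on the real page is unverified.

```python
from lxml import etree, html

# Hypothetical markup guessed from the extract() output; not the real page.
SAMPLE = """
<html><body>
<div class="appspec-value">
<div>Min: Android 4.0.3 (Ice Cream Sandwich MR1, API 15) </div>
<div>Target: Android 6.0 (Marshmallow, API 23)</div>
</div>
</body></html>
"""

value = html.document_fromstring(SAMPLE).xpath('//*[@class="appspec-value"]')[0]

# The expression as typed: "contains(" is opened but never closed before "]".
try:
    value.xpath('.//*[contains(text(), "Min:"]')
    broken_rejected = False
except etree.XPathError:
    # Scrapy reports the same underlying problem as a ValueError.
    broken_rejected = True

# Candidate 1: same idea with the parenthesis balanced. In XPath 1.0 this
# only tests the FIRST text child of each element.
min_by_element = value.xpath('.//*[contains(text(), "Min:")]/text()')

# Candidate 2: select the text nodes themselves and test each one.
min_by_text_node = value.xpath('.//text()[contains(., "Min:")]')

# Both lines at once, using "or" inside the predicate.
both = value.xpath('.//text()[contains(., "Min:") or contains(., "Target:")]')

min_line = min_by_text_node[0].strip()
print(min_line)
```

Candidate 2 seems the safer choice when an element has several text children, since each text node is tested individually. In the Scrapy shell the equivalent would presumably be android_version.xpath('.//text()[contains(., "Min:")]').extract().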