How to extract the value of an HTML attribute using scrapy response.xpath?

Question

I'm trying to extract the value of the data-asin-price attribute of a <div> tag.

In the example below, that value is 22.63:

<div id="cerberus-data-metrics" style="display: none;" data-asin="B079GMRZ8S" data-asin-price="22.63" data-asin-shipping="0.0" data-asin-currency-code="AUD" data-substitute-count="-1" data-device-type="WEB" data-display-code="Asin is not eligible because it is not enabled"></div>

Is there any way to do this using response.xpath() with scrapy?

Thank you

Possible duplicate of [Extract value of attribute node via XPath](https://stackoverflow.com/questions/4835891/extract-value-of-attribute-node-via-xpath) — Andersson, Nov 09 '18 at 06:08

score 3 · Accepted Answer · edited Nov 10 '18 at 11:33

I just wanted to post the answer I found.

To get the 22.63 value out of the data-asin-price attribute in scrapy shell, I did the following:

response.xpath('//div[@id = "cerberus-data-metrics"]/@data-asin-price').extract_first()

Cheers

edited Nov 10 '18 at 11:33

pwinz


answered Nov 09 '18 at 01:27

Jackknife


Nice work answering your own question. For clarity, I edited your question and your answer. You are actually trying to extract the value of an attribute, not an element. An HTML/XML element is the tag and everything in it. The attributes are things like `style`, `id`, and `href` and their values that appear in the opening portion of the tag. Knowing this will help you find answers in the future. – pwinz Nov 09 '18 at 19:00
@pwinz thank you for the edit and clarification. It will definitely help me in the future. Regards – Jackknife Nov 10 '18 at 08:46

score 0 · Answer 2 · answered Apr 17 '23 at 22:40

In the current version of scrapy (v2.8), you can also use its built-in extensions to CSS selectors. These extensions may also be available in earlier versions of scrapy.

response.css("div::attr(data-asin-price)").get()

In its generic form, replace CSS_SELECTOR and ATTRIBUTE_NAME as needed.

response.css("CSS_SELECTOR::attr(ATTRIBUTE_NAME)").get()

answered Apr 17 '23 at 22:40

thehale


NOTE: I recognize that the OP asked specifically about XPath. This question, however, was the top Google result when I generically searched for extracting HTML attributes using `scrapy`, and I find this CSS answer *much* easier to read. – thehale Apr 17 '23 at 22:40
