xpath google sheet importxml

Question

This is the page I want to run IMPORTXML on:

https://sg.finance.yahoo.com/quote/D05.SI

<div class="dot-label Pos(r) Fz(13px) Fw(500) D(ib) Ta(c)"><span>Current</span><!-- react-text: 18 -->&nbsp;<!-- /react-text --><span>25.08</span></div>

I would like to use IMPORTXML to retrieve the value 25.08. This is what I tried:

=IMPORTXML("https://sg.finance.yahoo.com/quote/D05.SI","//*[@class='dot-label Pos(r) Fz(13px) Fw(500) D(ib) Ta(c)']//span")

But it always returns #N/A. Please advise on the correct syntax or link, and explain the reasons so I can understand what went wrong.

Which value do you want to retrieve? Since the class is dynamic, the `class` attribute can change at any time, so you need another scraping technique to retrieve the data. — Ben, Dec 12 '17 at 10:00
I need to retrieve the current price. Can you advise the name of another scraping technique and method? — OOI YI YONG, Dec 12 '17 at 11:49
Does this answer your question? [Scraping data to Google Sheets from a website that uses JavaScript](https://stackoverflow.com/questions/74237688/scraping-data-to-google-sheets-from-a-website-that-uses-javascript) — Rubén, Jan 05 '23 at 02:01

score 1 · Answer 1 · answered Dec 12 '17 at 12:22

Selecting by id attribute is generally safer than selecting by class attribute here, because the page uses class binding and the class names change frequently. I found the closest DOM element that has an id attribute, then walked down from it until the current price was reached.

Try this XPath:

=INDEX(IMPORTXML("https://sg.finance.yahoo.com/quote/D05.SI","//div[@id='quote-header-info']/div[last()]/div[1]"),1)

The layout / DOM structure does change from time to time, though, so you may need to check and update the XPath to make sure it still scrapes the value you want.
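
To illustrate why anchoring on an id tends to hold up better than matching a generated class string, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The markup, the class names `price-v1` / `price-v2`, and the helper names are all hypothetical stand-ins, not the real Yahoo Finance page, and ElementTree only supports a subset of XPath, so treat this as an approximation of what IMPORTXML does rather than a reproduction of it.

```python
import xml.etree.ElementTree as ET


def make_page(price_class):
    """Build a hypothetical, heavily simplified quote page as well-formed XML."""
    return ET.fromstring(f"""
    <html><body>
      <div id="quote-header-info">
        <div><h1>D05.SI</h1></div>
        <div>
          <div class="{price_class}"><span>25.08</span></div>
          <div><span>At close</span></div>
        </div>
      </div>
    </body></html>""")


# Class-based path: depends on the exact (assumed) generated class string.
BY_CLASS = ".//div[@class='price-v1']/span"
# Id-based path: same shape as the XPath in the answer, plus /span for the text.
BY_ID = ".//div[@id='quote-header-info']/div[last()]/div[1]/span"


def first_text(page, path):
    """Return the text of the first match, or None when nothing matches."""
    matches = page.findall(path)
    return matches[0].text if matches else None


before = make_page("price-v1")
after = make_page("price-v2")  # simulate the site regenerating its class names

print(first_text(before, BY_CLASS))  # matches while the class is unchanged
print(first_text(after, BY_CLASS))   # no match once the class string changes
print(first_text(after, BY_ID))      # still matches, since the id did not change
```

The id-based path still breaks if the nesting under `quote-header-info` changes, which is the caveat mentioned above.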

answered Dec 12 '17 at 12:22

Ben


Wow, it worked perfectly, but I don't quite understand the code. What is the purpose of INDEX? Also, div[last()] gets the last div under @id='quote-header-info', am I right? Is there a link that documents all of this syntax, such as last() and first()? How many more are there? Does div[1] work like an array, but starting at 1 instead of 0? And how does the code reach the span and retrieve the text? – OOI YI YONG Dec 12 '17 at 12:58
div[last()] selects the last div child of div[@id='quote-header-info']. In div[1] the index starts at 1, so it means the first div child. The XPath returns several matched values, and the first one is the one we are interested in, which is why INDEX(..., 1) is used. – Ben Dec 12 '17 at 13:40
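
The points in these comments (1-based positions, `last()`, and taking the first of several returned values) can be tried out locally with a small Python sketch. The XML fragment and the value `+0.10` are made up for illustration, and picking `values[0]` is only a rough analogue of `INDEX(..., 1)`, assuming IMPORTXML returns one value per text node under the matched element.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the real page is far more deeply nested.
root = ET.fromstring("""
<div id="quote-header-info">
  <div>name row</div>
  <div>middle row</div>
  <div>
    <div><span>25.08</span><span>+0.10</span></div>
    <div>footer</div>
  </div>
</div>""")

# XPath positions are 1-based: div[1] is the first div child, not the second.
first_child = root.find("./div[1]")
print(first_child.text)

# last() is the position of the final sibling, so div[last()] is the third div here.
last_child = root.find("./div[last()]")
print(len(list(last_child)))  # number of element children of that last div

# Same shape as the answer's path: last div child, then its first div child.
target = root.find("./div[last()]/div[1]")

# The matched element holds several text values; collect them in document order.
values = list(target.itertext())
print(values)

# Taking the first value plays the role of INDEX(..., 1) in the sheet formula.
print(values[0])
```

XPath 1.0 has no `first()` function; `[1]` or `[position()=1]` covers that case, while `last()` and `position()` are the standard position functions.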
