
I would like to fetch part of a web page with YQL. I have tried several queries, and all but one return the correct result.

Here is the query:

select * from html where url="http://www.cngold.org/img_date/livesilvercn_rmb.html" and xpath='//div[6]/div[2]/div/div[2]/table/tbody/tr[4]/td[6]'

I expect to get the price, but I actually get an empty result.

If I retrieve the whole page with YQL and check the XPath of that element, the XPath is instead:

//div[3]/div/div[2]/a/div/div[2]/table/tbody/tr[4]/td[6]

Why are the two paths so different?

How should I handle this situation?

Thanks in advance.

Flame_Phoenix
bucherren
  • No idea what's on the page, but your first query selects the `5.82` value (I hope it's actually useful information and the values don't change very often). The second query doesn't get me anything (but it is a valid query). – Petr Janeček May 21 '12 at 20:42
  • Yes, 5.82 is useful information. But I got an empty result when I tried it in the YQL console. Thank you. Perhaps I should ask another person to try it. – bucherren May 24 '12 at 01:48
  • I tried it with Firefox + Firebug + [Firefinder](https://addons.mozilla.org/cs/firefox/addon/firefinder-for-firebug/). Isn't the problem that the value gets computed by some JavaScript after the page is loaded? The original, untouched source file doesn't contain the value, and YQL can't find what's computed with JS, of course. – Petr Janeček May 24 '12 at 07:40

1 Answer


YQL cannot retrieve values that are computed dynamically by JavaScript after the page loads. In that case, you are better off using a headless browser such as PhantomJS.
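To see why a path copied from a browser inspector can come back empty, here is a minimal Python sketch. It does not touch the real page: `STATIC_SOURCE` and `cell_text` are hypothetical stand-ins I made up for illustration, and the markup is a tiny well-formed snippet so the standard library can parse it. It shows two likely causes under those assumptions: browsers add a `tbody` element to the DOM even when the source has none, and a cell filled in by a script is empty in the raw source.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the raw page source (not the real page):
# the price cell is empty, and a script would fill it in after load.
# The script never runs here, just as it never runs for a plain fetch.
STATIC_SOURCE = """
<html><body>
  <div><table>
    <tr><td>Silver</td><td id="price"></td></tr>
  </table></div>
  <script>document.getElementById('price').textContent = '5.82';</script>
</body></html>
"""


def cell_text(markup, path):
    """Return the text of the first node matching path, or None if no node matches."""
    root = ET.fromstring(markup)
    node = root.find(path)
    if node is None:
        return None
    return node.text or ""


# Path as a browser inspector would report it, with the tbody the browser inserted.
browser_path = ".//table/tbody/tr[1]/td[2]"
# Path that matches the raw source, which has no tbody.
source_path = ".//table/tr[1]/td[2]"

print(repr(cell_text(STATIC_SOURCE, browser_path)))
print(repr(cell_text(STATIC_SOURCE, source_path)))
```

The first lookup matches nothing because the raw markup has no `tbody`; the second finds the cell, but it is still empty because the script that writes the price was never executed. A tool that runs the page's JavaScript is needed to see the final value.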

This answer https://stackoverflow.com/a/7978072/1337392 lists several tools you can use for HTML scraping.

Hope it helps!

Community
Flame_Phoenix