6

I have a web page with some JS APIs that don't alter the DOM but return some numbers. I'd like to write a Node.js application that downloads such pages and executes those functions in the context of the downloaded page.

I was looking at cheerio for page scraping, but while I can see how easy it is to navigate and manipulate the DOM with it, I don't see any way to run the page's functions. Is that possible?

Should I look, instead, at jsdom?

ggorlen
Tonyx
  • [This](http://stackoverflow.com/a/7978072/2172543) is the best SO answer I've found so far for your question. It isn't strictly about executing a web page's JavaScript; it's about HTML parsing. – Marcel Mar 24 '13 at 17:43

2 Answers

5

Sounds like you want to use PhantomJS, which will provide the fully rendered output, and then use cheerio on that.

Mark Selby
0

Cheerio and jsdom are both HTML scrapers and have no notion of executing JavaScript. If the API you wish to access is written in JavaScript, there is little to prevent you from extracting the functions and running them inside Node. Beware, though: downloading and executing arbitrary JavaScript can pose a serious security risk. If you want to simulate the behaviour of a browser, look at http://phantomjs.org/. It is a headless browser that can be driven from Node and can do everything an ordinary browser can.

Deathspike
  • 1
    Note that if you do want to run JS safely in Node, it's perfectly doable via the `vm` module that has a `runInContext` method that is completely isolated from the rest of your code (but can still hog resources). – Benjamin Gruenbaum May 11 '14 at 20:32
  • 4
    jsdom **is not** just an HTML scraper with no notion of executing JavaScript. See the docs: [Initialization lifecycle](https://github.com/tmpvar/jsdom/blob/master/README.md#initialization-lifecycle) and [For the hardcore: jsdom.jsdom](https://github.com/tmpvar/jsdom/blob/master/README.md#for-the-hardcore-jsdomjsdom) – rsp Jul 30 '14 at 23:39
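
To make the "extract the functions and run them inside Node" idea concrete, here is a minimal sketch using Node's built-in `vm` module. The HTML string and the function names `getAnswer` and `add` are made up for illustration; in practice the source would be the downloaded page. The regex-based extraction is a deliberate simplification, and the sketch only works for scripts that don't touch `document` or `window`, since the context has no DOM. Also note that the Node documentation states that `vm` is not a security mechanism, so the timeout below guards against runaway loops but should not be treated as a sandbox for untrusted code.

```javascript
const vm = require('vm');

// Hypothetical page source; in practice this would be the downloaded HTML.
const html = `
<html><body>
<script>
  function getAnswer() { return 6 * 7; }
  function add(a, b) { return a + b; }
</script>
</body></html>`;

// Naive extraction of inline <script> bodies. A regex is enough for this
// sketch; a real page would call for a proper HTML parser such as cheerio.
function extractInlineScripts(source) {
  const scripts = [];
  const pattern = /<script\b[^>]*>([\s\S]*?)<\/script>/gi;
  let match;
  while ((match = pattern.exec(source)) !== null) {
    scripts.push(match[1]);
  }
  return scripts;
}

// Run every inline script in one fresh context and return that context.
// Top-level function declarations become properties of the context.
function loadPageFunctions(source) {
  const context = vm.createContext({});
  for (const code of extractInlineScripts(source)) {
    vm.runInContext(code, context, { timeout: 1000 });
  }
  return context;
}

const page = loadPageFunctions(html);

// Call the page's functions inside the same context, again with a timeout.
const answer = vm.runInContext('getAnswer()', page, { timeout: 1000 });
const sum = vm.runInContext('add(2, 3)', page, { timeout: 1000 });
console.log(answer, sum);
```

If the page's functions do rely on the DOM, this approach falls short, and a DOM-aware option such as jsdom with script execution enabled, or a headless browser, is likely a better fit.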