
Is there an API for Node.js to fetch and query HTML, both from URLs and from static HTML?

I'd like to do something like this for web scraping:

Document doc = Jsoup.connect("http://en.wikipedia.org/").get();
Elements newsHeadlines = doc.select("#mp-itn b a");

I had a look at this question and went through most of those APIs, but I haven't found (or perhaps couldn't identify) anything similar.

alexpfx

1 Answer


jsdom is probably what you want: https://github.com/tmpvar/jsdom

You can use it in combination with jQuery to query the DOM. Here's an example of how I've been using it in one of my projects: https://github.com/gabesoft/seryth/blob/master/lib/sanitizer.js

You'll probably also need request to fetch the HTML from URLs: https://github.com/request/request
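For a rough idea of how the Jsoup snippet from the question might translate, here is a minimal sketch. It assumes a recent jsdom (`npm install jsdom`) and Node 18+. Note that it swaps two things from the suggestion above: it uses Node's built-in `fetch` instead of request (which has since been deprecated), and the DOM's own `querySelectorAll` instead of jQuery. `fetchDocument` and `selectText` are hypothetical helper names, not part of jsdom.

```javascript
// Sketch only: assumes `npm install jsdom` and Node 18+ (for the global fetch).
// fetchDocument and selectText are hypothetical helpers, not jsdom APIs.

// Download a page and parse it into a DOM document.
async function fetchDocument(url) {
  const { JSDOM } = require("jsdom"); // loaded lazily, only when a page is fetched
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const html = await response.text();
  // Passing `url` lets jsdom resolve relative links against the page address.
  return new JSDOM(html, { url }).window.document;
}

// Return the trimmed text of every element matching a CSS selector.
function selectText(document, selector) {
  return Array.from(document.querySelectorAll(selector), (el) =>
    el.textContent.trim()
  );
}

// Rough equivalent of the Jsoup lines in the question:
// fetchDocument("http://en.wikipedia.org/")
//   .then((doc) => console.log(selectText(doc, "#mp-itn b a")))
//   .catch((err) => console.error(err.message));
```

For static HTML you can skip the download and pass the string straight to `new JSDOM(html)`, then call `selectText` on `.window.document` in the same way.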

gabesoft