5

I'm writing a scraper in Node. Is there a module out there that will let me work with CSS selectors?

BoltClock
George Mauer

4 Answers

5

Look at the excellent jsdom, and specifically this section, which shows how you can use jQuery in Node to scrape HTML documents and so query them with the CSS-like selectors that jQuery offers.
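As a rough sketch of the idea, the snippet below queries a document through jsdom's own `querySelectorAll` rather than through jQuery, so it is a swapped-in variant and not the approach from the linked section. It assumes jsdom is installed (`npm install jsdom`); the helper name `selectText` is hypothetical.

```javascript
// Minimal sketch, assuming `npm install jsdom`.
// `selectText` is a hypothetical helper name, not part of jsdom.
function selectText(html, selector) {
  // Required here so the dependency of this sketch is explicit.
  const { JSDOM } = require('jsdom');

  // Parse the HTML string into a DOM and take its document object.
  const { document } = new JSDOM(html).window;

  // querySelectorAll accepts a CSS selector and returns a NodeList;
  // Array.from converts it while mapping each element to its trimmed text.
  return Array.from(document.querySelectorAll(selector), (el) =>
    el.textContent.trim()
  );
}
```

For example, `selectText('<ul><li class="item">One</li><li>Two</li></ul>', 'li.item')` should give back an array containing only `'One'`.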

Domenic
matehat
  • Strictly speaking, as it leverages jQuery it makes use of jQuery selectors, so [some CSS3 selectors may not be available](http://stackoverflow.com/questions/11745274/what-css3-selectors-does-jquery-really-support-e-g-nth-last-child), unless the native Selectors API is supported somehow... – BoltClock Jan 12 '13 at 05:09
4

Cheerio is also a good alternative. It implements a subset of core jQuery, so you can query documents with jQuery-style selectors.
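A minimal sketch of what that could look like, assuming Cheerio is installed (`npm install cheerio`). The helper name `extractLinks` and the markup it expects (a `ul#results` list containing `a.title` links) are made up for illustration.

```javascript
// Minimal sketch, assuming `npm install cheerio`.
// `extractLinks` and the selector below are hypothetical examples.
function extractLinks(html) {
  // Required here so the dependency of this sketch is explicit.
  const cheerio = require('cheerio');

  // load() parses the HTML and returns a jQuery-like query function.
  const $ = cheerio.load(html);

  // Select with a CSS selector, map each match to a plain object,
  // then call get() to turn the Cheerio collection into a real array.
  return $('ul#results a.title')
    .map((i, el) => ({
      text: $(el).text().trim(),
      href: $(el).attr('href'),
    }))
    .get();
}
```

Given `<ul id="results"><li><a class="title" href="/a">First</a></li><li><a href="/b">Skip</a></li></ul>`, this should return a single entry for the `First` link, since the second anchor lacks the `title` class.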

1

If all you need is a module that allows you to use CSS selectors, I'd suggest having a look at sel.

It already supports CSS4 (!) selectors (at least in part), is smaller than Sizzle (jQuery's selector engine), and focuses on one task instead of trying to do everything.

Golo Roden
  • cool, I'm looking for server-side only but still cool – George Mauer Jan 12 '13 at 05:54
  • Can you post example code? For me, `node -r sel` crashes already, because it references an undefined variable [`document` in line 8](https://github.com/amccollum/sel/blob/addab28b60aa9ee217cfe9be63d58eb589b133c1/lib/sel.js#L8). This leads me to believe sel is intended to be used only in browsers, not node.js. – phihag May 09 '17 at 21:25
0

soupselect would be another choice. I was looking for a Node.js library that works like Nokogiri in Ruby. With it you can run CSS queries not only over a whole document but also over HTML fragments. soupselect is only a selector engine, so you also need htmlparser2 for parsing.
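To illustrate the split between parsing and selecting, here is a dependency-free toy sketch. It is not the soupselect or htmlparser2 API: it assumes the parser has already produced a tree of plain objects with a hypothetical `{ tag, attrs, children }` shape, and its hypothetical `select` function handles only compound selectors made of a tag name, `#id`, and `.class` parts.

```javascript
// Toy illustration only: NOT the soupselect or htmlparser2 API.
// Assumed (hypothetical) node shape: { tag, attrs, children }.

// Split a compound selector such as "a.title#main" into its parts.
function parseCompound(selector) {
  const parts = { tag: null, id: null, classes: [] };
  const tokens = selector.match(/(^[a-zA-Z][\w-]*)|([#.][\w-]+)/g) || [];
  for (const token of tokens) {
    if (token[0] === '#') parts.id = token.slice(1);
    else if (token[0] === '.') parts.classes.push(token.slice(1));
    else parts.tag = token.toLowerCase();
  }
  return parts;
}

// Check one node against the parsed tag, id, and class requirements.
function matches(node, parts) {
  const attrs = node.attrs || {};
  const classList = (attrs.class || '').split(/\s+/).filter(Boolean);
  if (parts.tag && node.tag !== parts.tag) return false;
  if (parts.id && attrs.id !== parts.id) return false;
  return parts.classes.every((name) => classList.includes(name));
}

// Depth-first walk that collects every matching node. The starting node
// can be a whole document or just a fragment, and is itself tested too.
function select(root, selector) {
  const parts = parseCompound(selector);
  const found = [];
  (function walk(node) {
    if (matches(node, parts)) found.push(node);
    for (const child of node.children || []) walk(child);
  })(root);
  return found;
}

// A hand-built fragment standing in for parser output.
const fragment = {
  tag: 'ul',
  attrs: { id: 'menu' },
  children: [
    { tag: 'li', attrs: { class: 'item active' }, children: [] },
    { tag: 'li', attrs: { class: 'item' }, children: [] },
  ],
};

console.log(select(fragment, 'li.item').length); // 2
console.log(select(fragment, 'li.active').length); // 1
```

A real selector engine also handles combinators, attribute selectors, and pseudo-classes, which is why the answers here point to existing libraries.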

9re