24

Which one would you choose? The attributes that matter most to me are (in no particular order):

  1. Support and future enhancements.
  2. Community and general knowledge base (on the Internet).
  3. Comprehensiveness (i.e., proven to parse a wide range of *.*ml pages).
  4. Performance.
  5. Memory footprint (runtime, not the code-base).
the Tin Man
roshan

3 Answers

36

Pick Nokogiri on all five points, and especially the first: Hpricot is no longer maintained.

Meta answer: See ruby-toolbox to get an idea of the popularity of different tools in a given area.

the Tin Man
Marc-André Lafortune
  • update - hpricot is no longer being maintained, which makes the choice even easier. – jsh Jan 15 '13 at 23:05
8

Only pick Hpricot if you don't have, or can't install, LibXML on the computer you're using. Otherwise choose Nokogiri; it is better than Hpricot on all five of the attributes mentioned.
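
If you can't know in advance whether Nokogiri will load on a given machine, one way to apply this rule is to try each library in order of preference. This is only a minimal sketch: `first_available_parser` is a hypothetical helper name, not part of either gem, and it assumes both gems are required under their usual names (`nokogiri` and `hpricot`).

```ruby
# Hypothetical helper: returns the name of the first library in
# `candidates` that can be required, or nil if none of them load.
def first_available_parser(candidates = %w[nokogiri hpricot])
  candidates.find do |name|
    begin
      require name
      true
    rescue LoadError
      # Gem not installed, or its native extension could not be loaded.
      false
    end
  end
end

parser = first_available_parser
puts(parser ? "Using #{parser}" : "No HTML parser gem is installed")
```

Listing Nokogiri first keeps it the default, so Hpricot is only reached when Nokogiri fails to load.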

the Tin Man
SztupY
  • Since literally a couple of minutes ago, there's also a pure-Java version of Nokogiri. So, you can use Nokogiri on JRuby without FFI and without libxml. (Google App Engine is one example where FFI is not possible.) – Jörg W Mittag May 22 '10 at 19:47
6

The one case where I've found Hpricot useful is broken HTML that needs to stay broken after processing. Hpricot is good about modifying only the portion of a document you have changed. Unless you need that, Nokogiri is the way to go.

nil