I'm looking for a forgiving HTML parser for scraping HTML and extracting data in Ruby. I've had success using BeautifulSoup for this; what is the Ruby equivalent?
2 Answers
6
Also see "Nokogiri vs Hpricot" before making a choice. Nokogiri seems to outperform Hpricot (I haven't benchmarked it myself) and has a nice syntax, IMO.

Uku Loskit
- Thank you. I used Nokogiri and it was sufficient for my purposes. I think the HTML I threw at it was well-formed, so I haven't researched how fault tolerant it is. – Adam Loving Sep 16 '10 at 17:34
- Update for 2013: the Hpricot README on GitHub says it is no longer maintained and recommends Nokogiri instead. – antinome Jan 09 '13 at 02:33
0
There was a Rubyful Soup gem, a Ruby port of BeautifulSoup, but it is no longer maintained and its site now recommends Hpricot.

Daniel Vandersluis