Can someone please tell me how to convert this line of JavaScript to Ruby using Hpricot and a regex?
// Replace all doubled-up <BR> tags with <P> tags, and remove fonts.
var pattern = new RegExp ("<br/?>[ \r\n\\s]*<br/?>", "g"); // "\\s" so the class matches whitespace; a bare "\s" in a string literal is just the letter "s"
document.body.innerHTML = document.body.innerHTML.replace(pattern, "</p><p>").replace(/<\/?font[^>]*>/g, '');
The code I have set up so far is:
require 'rubygems'
require 'hpricot'
require 'open-uri'
@file = Hpricot(open("http://www.bubl3r.com/article.html"))
Thanks
[...] tag was meant to be used.
– NullUserException Aug 09 '10 at 00:51
[...] tags, right? What if you hit code like [...] bar [...] is semantic, not presentational, and there's nothing wrong with [...] anyway. At the very least, avoid using regex for parsing HTML: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454
– Justin Morgan - On strike Feb 24 '11 at 23:20
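For what it's worth, here is a minimal sketch of how the two substitutions from the JavaScript might look in Ruby. It assumes the page markup is available as a plain string (with Hpricot that would presumably come from something like `@file.to_html`, which is worth checking against your version). The method name `paragraphize` and the constant names are made up for illustration. It is a literal translation of the regex approach, so the caveat above about parsing HTML with regex applies to it as well.

```ruby
# Literal Ruby translation of the two JavaScript replace() calls.
# Names below (DOUBLE_BR, FONT_TAG, paragraphize) are hypothetical,
# not from the original post.

# Two <br> or <br/> tags separated only by whitespace.
# Ruby's \s already covers space, \r and \n.
DOUBLE_BR = /<br\/?>\s*<br\/?>/

# Any opening or closing <font> tag, with or without attributes.
FONT_TAG = /<\/?font[^>]*>/

# gsub replaces every match, like the "g" flag in JavaScript.
def paragraphize(html)
  html.gsub(DOUBLE_BR, "</p><p>").gsub(FONT_TAG, "")
end

sample = "<p>one<br/>\n <br>two <font size=\"2\">three</font></p>"
puts paragraphize(sample)
```

Both patterns are case-sensitive, as in the original; add the `i` flag if the markup may contain uppercase `<BR>` or `<FONT>` tags.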