
I am working on a simple Ruby script to parse the names of race horses from a webpage. The regex below matches when I test it on http://rubular.com/, but my script prints nothing when I run it.

require 'open-uri'

url = "http://www.bloodhorse.com/horse-racing/race/race-results"
# URI.open is the open-uri entry point; Kernel#open no longer accepts URLs in Ruby 3.0+.
content = URI.open(url).read

if content =~ /(<span class="horseName">)(\n)(.*?)(\>)(.*?)(<\/a>)/
    print $5, "\n"
end

An example of some of the page's source is:

<li value="2">
<span class="horseName">
<a href="/horse-racing/thoroughbred/felonious-fred/2010">Felonious Fred</a>

So I would expect my script to print the 5th capture group of the match, which in this case should be "Felonious Fred". What am I doing wrong?

  • If I were you, I would use something like Nokogiri. – Marek Lipka Oct 23 '13 at 10:08
  • I feel it necessary to link this immortal answer from the Java section of SO: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – mcfinnigan Oct 23 '13 at 10:10

1 Answer

If you are scraping a webpage, I suggest using the Nokogiri gem. It will save you the trouble of writing a regex.

JunaidKirkire