
I am trying to understand regex in more detail.

I am trying to extract each paragraph from the following HTML page:

simbatish
  • [You shouldn't try to parse HTML with RegEx](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) – Bohemian Aug 14 '11 at 07:49
  • Use [HTML::Parser](http://search.cpan.org/dist/HTML-Parser/), don't waste your time trying to come up with a fragile home-brew parser. – mu is too short Aug 14 '11 at 08:02

1 Answer


You could also take a look at pQuery, a Perl port of jQuery; I found it extremely useful.
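To illustrate the general idea of letting a real parser find the paragraphs instead of a regex, here is a minimal sketch. It is written in Python rather than Perl so that it runs with only the standard library (`html.parser`); it is not pQuery's API. The names `ParagraphExtractor`, `extract_paragraphs`, and the `sample` markup are made up for this example, since the original page from the question is not shown.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text content of every <p> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []   # finished paragraphs, in document order
        self._buffer = None    # text pieces of the <p> currently open, or None

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._flush()      # a new <p> implicitly ends an unclosed one
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p":
            self._flush()

    def handle_data(self, data):
        # Only keep text that appears while a <p> is open.
        if self._buffer is not None:
            self._buffer.append(data)

    def close(self):
        super().close()        # let the parser emit any pending text first
        self._flush()

    def _flush(self):
        if self._buffer is not None:
            # Join the pieces and collapse runs of whitespace.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._buffer = None


def extract_paragraphs(html):
    """Return the text of each paragraph in the given HTML string."""
    parser = ParagraphExtractor()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


# Assumed sample input; the real page from the question is not available.
sample = (
    "<html><body><h1>Title</h1>"
    "<p>First <b>bold</b> paragraph.</p>"
    "<p class='x'>Second\nparagraph.</p>"
    "</body></html>"
)
print(extract_paragraphs(sample))
```

Nested inline tags, attributes, and line breaks inside a paragraph are handled by the parser here, which is exactly where a hand-written regex tends to break.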

Dimitar Petrov
  • [HTML::Query](http://p3rl.org/HTML::Query) and [Web::Query](http://p3rl.org/Web::Query) are better. – daxim Aug 24 '11 at 09:29