So far this captures everything I need that ends with 'em'. I need the regex to also capture paragraphs ending in 'ppp'.

My regex:

%<h2>Storyline</h2>(.*)em%s
el_pup_le

1 Answer

I would advise not to parse HTML with regex, but this seems easy enough seeing as you aren't actually parsing it as HTML...

%<h2>Storyline</h2>(.*?)(?:em|ppp)%s
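
To illustrate what the `(?:em|ppp)` alternation and the lazy `(.*?)` change, here is a minimal sketch in Python rather than PHP. It assumes the `%...%s` delimiters and `s` modifier map to `re.DOTALL`; the function name and the sample strings are invented for illustration and are not from the question.

```python
import re

# Python rendering of %<h2>Storyline</h2>(.*?)(?:em|ppp)%s
# The trailing "s" modifier corresponds to re.DOTALL: "." also matches newlines.
LAZY = re.compile(r"<h2>Storyline</h2>(.*?)(?:em|ppp)", re.DOTALL)

# The question's original quantifier, kept greedy, for comparison.
GREEDY = re.compile(r"<h2>Storyline</h2>(.*)(?:em|ppp)", re.DOTALL)


def storyline(html, pattern=LAZY):
    """Return the text captured after the heading, or None if nothing matches."""
    match = pattern.search(html)
    return match.group(1) if match else None


# Hypothetical inputs: one ends in 'ppp', one contains both terminators.
ends_in_ppp = "<h2>Storyline</h2>\nSecond plot.ppp ignored"
has_both = "<h2>Storyline</h2>one em two ppp three"

lazy_result = storyline(has_both)            # stops at the first terminator
greedy_result = storyline(has_both, GREEDY)  # runs on to the last terminator
```

With the lazy quantifier the capture stops at the first `em` or `ppp`; with the greedy one it extends to the last occurrence of either, which can swallow more of the page than intended.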
BoltClock
  • Why shouldn't HTML be parsed with regex? – el_pup_le Jan 09 '11 at 11:44
  • If your HTML has a very simple format or is a one-liner, then there is nothing wrong with using regex. However, if the structure is unpredictable or large, you're much better off using a parser, like `DOMDocument`, which will handle all the parsing of the markup for you so you can focus on getting the information from the markup. – BoltClock Jan 09 '11 at 11:46
  • @aLk see [Best Methods to parse HTML](http://stackoverflow.com/questions/3577641/best-methods-to-parse-html/3577662#3577662) – Gordon Jan 09 '11 at 11:53
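
As a rough illustration of the parser approach suggested in the comments, the sketch below uses Python's standard-library `html.parser` as a stand-in for PHP's `DOMDocument`. The class and function names are hypothetical, and it assumes the storyline section runs until the next `<h2>`, which the question does not state.

```python
from html.parser import HTMLParser


class StorylineParser(HTMLParser):
    """Collect text following an <h2>Storyline</h2> heading, up to the next <h2>."""

    def __init__(self):
        super().__init__()
        self.in_h2 = False      # currently inside an <h2> element
        self.heading = []       # text pieces of the current heading
        self.capturing = False  # inside the Storyline section
        self.chunks = []        # collected section text

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_h2 = True
            self.heading = []
            self.capturing = False  # assumption: a new <h2> ends the section

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_h2 = False
            self.capturing = "".join(self.heading).strip() == "Storyline"

    def handle_data(self, data):
        if self.in_h2:
            self.heading.append(data)
        elif self.capturing:
            self.chunks.append(data)


def storyline_text(html):
    """Return the plain text of the Storyline section, with markup stripped."""
    parser = StorylineParser()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


# Hypothetical page fragment, invented for illustration.
page = "<h2>Storyline</h2><p>A <em>short</em> plot.</p><h2>Cast</h2><p>Nobody</p>"
text = storyline_text(page)
```

Because the parser works on tags rather than on literal endings such as 'em' or 'ppp', it does not depend on what markup happens to follow the paragraph.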