I have the variable $Contents that contains the contents of a webpage and I need to pull out the following:

Start: <div class="XXXXX">

End: <div role="XXXXX"

The string represented by YYYYY could be numbers, characters, spaces, quotes and pretty much anything that exists on a modern keyboard.

Currently I am using this:

preg_match("/<div class=\"XXXXX\">(.*)<div role=\"XXXXX\"/", $Contents, $match);
echo "<p>Event Title: $match[1]</p>";

But I'm getting nothing back, so I assume my regex is the issue. Can anyone help?

elixenide
Harlequin

1 Answer

I'm assuming the second XXXXX should be YYYYY, or maybe you just mean it could be any string.

First, you really should use a parser instead of regex for this. See this classic, sad tale for the reason why.
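As a rough illustration of the parser route, here is a minimal sketch using PHP's built-in DOMDocument and DOMXPath. It assumes the role div is nested inside the class div and that the text you want is the first text node before it; the sample markup and the `$title` variable are placeholders of mine, not taken from your page.

```php
<?php
// Hypothetical sample markup; XXXXX and the role value are placeholders.
$Contents = '<div class="XXXXX">
    foo bar
    <div role="alacadabra">baz</div></div>';

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them,
// since real-world HTML is rarely perfectly valid.
libxml_use_internal_errors(true);
$doc->loadHTML($Contents);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// First text node that sits directly inside the div with the target class.
$node = $xpath->query('//div[@class="XXXXX"]/text()[1]')->item(0);
$title = $node !== null ? trim($node->nodeValue) : '';

echo "<p>Event Title: " . htmlspecialchars($title) . "</p>";
```

If the real page structures the two divs differently, the XPath expression would need adjusting, but the overall approach stays the same.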

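Second, to answer your question: add a ? after .* to make the match lazy (it stops at the first closing delimiter instead of the last), and add the s modifier after the final slash so that . also matches newlines, like this: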

$Contents = '<div class="XXXXX">
    foo bar
    <div role="alacadabra">baz';
preg_match("/<div class=\"XXXXX\">(.*?)<div role=\".+\"/s", $Contents, $match);
echo "<p>Event Title: $match[1]</p>"; // outputs foo bar (with its surrounding whitespace)
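To show why the ? matters, here is a small sketch comparing the greedy and lazy versions on input that contains more than one role div. The sample string and the variable names `$greedy` and `$lazy` are made up for illustration; it also checks the return value of preg_match so a failed match doesn't produce an undefined index notice.

```php
<?php
// Hypothetical input with two role divs after the target div.
$Contents = '<div class="XXXXX">first<div role="a">x</div><div role="b">y</div>';

// Greedy: .* runs to the end, then backtracks to the LAST '<div role="'.
preg_match('/<div class="XXXXX">(.*)<div role="/s', $Contents, $greedy);

// Lazy: .*? stops at the FIRST '<div role="'.
$found = preg_match('/<div class="XXXXX">(.*?)<div role="/s', $Contents, $lazy);

// $greedy[1] is 'first<div role="a">x</div>'
// $lazy[1]   is 'first'
if ($found === 1) {
    echo "<p>Event Title: " . htmlspecialchars(trim($lazy[1])) . "</p>";
} else {
    echo "<p>No match found.</p>";
}
```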
Community
elixenide