
I cannot get a regex to match when the start tag and the end tag are on different lines.

The match should start at <p class="psku"> and

end at </span></p>

<p class="psku">Number: rrfaee220-1</p>
<p class="availability order-only">Delivery: <span> 1-2 months</span></p>

The regex should go in place of SOMETHINGREGEX in this call:

preg_match_all("/<p class=\"psku\">SOMETHINGREGEX</span></p>/", $string, $info);
sirgeorge
    Possible duplicate of [RegEx match open tags except XHTML self-contained tags](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – M. Eriksson May 05 '18 at 08:59
    What is the reason you aren't using DOMDocument (http://php.net/manual/en/class.domdocument.php)? Worth reading: https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags – konrados May 05 '18 at 09:00

1 Answer


First of all, you are using / as the delimiter in your regex. That is fine, but you then have to escape every forward slash inside the pattern, like this:

/<p class=\"psku\">SOMETHINGREGEX<\/span><\/p>/

If you're like me and think this looks messy, you can also choose a different character as the delimiter:

@<p class=\"psku\">SOMETHINGREGEX</span></p>@

Additionally, what's inside your SOMETHINGREGEX? I suspect it contains a dot (.). By default the dot does not match newline characters; to let it match them and stretch across multiple lines, add the s modifier:

@<p class=\"psku\">SOMETHINGREGEX</span></p>@s
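To make the points above concrete, here is a minimal sketch run against the two lines from the question. The lazy capture group `(.*?)` is only my assumption about what SOMETHINGREGEX could be; the question never says what it contains.

```php
<?php
// Sample input taken from the question: the two tags sit on separate lines.
$string = <<<HTML
<p class="psku">Number: rrfaee220-1</p>
<p class="availability order-only">Delivery: <span> 1-2 months</span></p>
HTML;

// '@' as delimiter means '/' needs no escaping.
// The 's' modifier lets '.' match newlines.
// '(.*?)' is a hypothetical stand-in for SOMETHINGREGEX; being lazy, it stops
// at the first '</span></p>' rather than the last one.
$pattern = '@<p class="psku">(.*?)</span></p>@s';

preg_match_all($pattern, $string, $info);

// $info[0] holds the full matches, $info[1] the text captured between the tags.
print_r($info[1]);

// The same pattern without 's' finds nothing, because '.' stops at the line break.
$withoutS = preg_match_all('@<p class="psku">(.*?)</span></p>@', $string, $unused);
var_dump($withoutS); // int(0)
```

Writing the pattern in single quotes also avoids having to escape the double quotes around psku.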

That said, as @konrados mentioned, DOMDocument would be the best choice here. Parsing HTML with regex is unreliable because you have to account for many formatting variations: tags written in capital letters, whitespace in places you wouldn't expect, etc. If you're certain that all your input is formatted the same way, regex should do the trick.
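For comparison, a rough sketch of the DOMDocument route, assuming what you actually want are the product number and the delivery time. The XPath expressions and variable names are mine, not something from the question.

```php
<?php
// Same sample input as in the question.
$html = <<<HTML
<p class="psku">Number: rrfaee220-1</p>
<p class="availability order-only">Delivery: <span> 1-2 months</span></p>
HTML;

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// Exact class match for the SKU paragraph; contains() for the paragraph
// that carries more than one class.
$sku = trim($xpath->evaluate('string(//p[@class="psku"])'));
$delivery = trim($xpath->evaluate('string(//p[contains(@class, "availability")]/span)'));

echo $sku, "\n";      // Number: rrfaee220-1
echo $delivery, "\n"; // 1-2 months
```

Line breaks and extra whitespace between the tags make no difference here, which is the main reason to prefer a parser over a pattern.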

Jonan