2

I recently stumbled upon this:

PCRE Regex Syntax - Recursive Patterns

It appears to open up possibilities to "match" HTML tags, which regular expressions were not good at. Can this experimental feature be used in any way to parse fragments of HTML, or even a whole document?

Salman A
  • It can parse HTML just like you can dig a grave with a spoon. – alex May 18 '12 at 15:58
  • possible duplicate of [Oh Yes You Can Use Regexes to Parse HTML!](http://stackoverflow.com/questions/4231382/regular-expression-pattern-not-matching-anywhere-in-string/4234491#4234491) – mario May 18 '12 at 15:58
  • @mario: does not say anything about recursion. – Salman A May 18 '12 at 16:24
  • It would be easier to parse html using substr and strpos imho. With regex, it is near impossible to take into account all possibilities, malformed html being the top offender. – dqhendricks May 18 '12 at 16:48

1 Answer

3

It is highly recommended not to use regex for HTML parsing, with or without recursion. People often reach for it because when you have a hammer, everything looks like a nail. The right tool is something like PHP's DOMDocument class, which is designed for exactly this type of problem.

http://php.net/manual/en/class.domdocument.php
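
As a rough illustration of the DOMDocument approach, here is a minimal sketch that pulls links out of a fragment. The sample markup (including its deliberately unclosed `<b>` tag) and the `$links` variable are assumptions made up for this example, not anything from the question.

```php
<?php
// Hypothetical fragment for illustration; note the unclosed <b> tag.
$html = '<div><p>Hello <b>world</p><a href="/x">link</a></div>';

// Collect libxml warnings about the malformed markup instead of printing them.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Walk every <a> element the parser found and record its href and text.
$links = [];
foreach ($doc->getElementsByTagName('a') as $a) {
    $links[] = $a->getAttribute('href') . ' => ' . $a->textContent;
}

print_r($links);
```

The parser recovers from the unbalanced markup on its own, which is exactly the case a recursive pattern would have to anticipate by hand.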

dqhendricks