How to grab an HTML tag's content with preg_match_all

Question

I have some HTML that contains this:

<table class="qprintable2" width="100%" cellpadding="4" cellspacing="0" border="0">
content goes here !
</table>

I have this function to match the content inside the tag:

function getTextBetweenTags($string, $tagname)
{
  $pattern = "/<table class=\"class1\" width=\"100%\" cellpadding=\"4\" cellspacing=\"0\" border=\"0\">(.*?)<\/$tagname>/"; 
  preg_match_all($pattern, $string, $matches);
  return $matches[1];
}

But it doesn't match anything, so I would highly appreciate it if you could give me a good pattern for this :(

possible duplicate of [RegEx match open tags except XHTML self-contained tags](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) — Brad Mace, Dec 18 '10 at 02:34
THE AWESOMEST REGEX HTML PARSER EVAR: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 — Ignacio Vazquez-Abrams, Dec 18 '10 at 02:36

score 4 · Answer 1 · answered Dec 18 '10 at 02:38

You should avoid this, but you can use a regex like:

preg_match('#<table[^>]+>(.+?)</table>#ims', $str, $matches);  // $matches[1] holds the captured content

The various tricks here are:

the ims modifiers: "s" makes "." also match newlines, "i" makes the match case-insensitive, and "m" is the multiline option (for ^ and $)
using # instead of / to enclose the regex, so you don't have to escape the slash in HTML closing tags
using [^>]+ to keep it unspecific and avoid listing individual HTML attributes (more reliable)
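
Applied to the table from the question, the same pattern could be used with preg_match_all roughly like this. It is only a sketch: the helper name getTableContents and the sample $html are made up for illustration, and the "m" modifier is left out because the pattern has no ^ or $ anchors.

```php
<?php
// Hypothetical helper (the name is illustrative, not from the question):
// returns the inner content of every <table ...>...</table> found in $html.
function getTableContents(string $html): array
{
    // i: case-insensitive, s: "." also matches newlines.
    // [^>]+ skips over whatever attributes the opening tag carries.
    preg_match_all('#<table[^>]+>(.+?)</table>#is', $html, $matches);
    return $matches[1]; // first capture group of every match
}

// Sample input copied from the question.
$html = '<table class="qprintable2" width="100%" cellpadding="4" cellspacing="0" border="0">
content goes here !
</table>';

$contents = getTableContents($html);
echo trim($contents[0]), "\n"; // prints: content goes here !
```

Because (.+?) stops at the first </table> it meets, nested tables would be cut short, which is one reason a real parser is the safer choice.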

While this is a case where regexes would work okayish, the general consensus is that you should use QueryPath or phpQuery (or similar) to extract HTML. It's also mucho simpler:

qp($html)->find("table")->text();  //would return just the text content
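
If installing QueryPath or phpQuery is not an option, a similar result can probably be had with PHP's bundled DOM extension. The following is a sketch under assumptions, not the approach the answer names: it swaps in DOMDocument and DOMXPath, assumes the DOM extension is enabled, and uses a sample table with a made-up row and cell around the text.

```php
<?php
// Sketch using PHP's bundled DOM extension instead of QueryPath/phpQuery.
// The <tr><td> wrapper is an assumption added to make the sample valid HTML.
$html = '<table class="qprintable2" width="100%" cellpadding="4" cellspacing="0" border="0">
<tr><td>content goes here !</td></tr>
</table>';

$doc = new DOMDocument();
// Real-world HTML is often invalid; collect parser warnings instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$texts = [];
// Select the table by its class rather than by its full attribute list.
foreach ($xpath->query('//table[@class="qprintable2"]') as $table) {
    $texts[] = trim($table->textContent); // text only, markup stripped
}

echo $texts[0], "\n";
```

Matching on the class attribute alone means the code keeps working if the width or cellpadding attributes change.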
