I have some HTML that contains this:

<table class="qprintable2" width="100%" cellpadding="4" cellspacing="0" border="0">
content goes here !
</table>

I have this function to match the content inside the tag:

function getTextBetweenTags($string, $tagname)
{
  $pattern = "/<table class=\"class1\" width=\"100%\" cellpadding=\"4\" cellspacing=\"0\" border=\"0\">(.*?)<\/$tagname>/"; 
  preg_match_all($pattern, $string, $matches);
  return $matches[1];
}

But it doesn't match anything, so I would really appreciate it if you could give me a good pattern for this :(

ajreal
mrvivi
  • possible duplicate of [RegEx match open tags except XHTML self-contained tags](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) – Brad Mace Dec 18 '10 at 02:34
  • THE AWESOMEST REGEX HTML PARSER EVAR: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Ignacio Vazquez-Abrams Dec 18 '10 at 02:36

1 Answer

You should avoid this, but you can use a regex like:

preg_match('#<table[^>]+>(.+?)</table>#ims', $str, $matches);  // $matches[1] holds the table content

The various tricks here are:

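  • the /ims modifiers: s makes "." also match newlines, i makes the match case-insensitive, and m makes ^ and $ match at line boundaries (not strictly needed here, since the pattern uses neither anchor)
  • using # instead of / as the regex delimiter, so you don't have to escape the / in HTML closing tags
  • using [^>]+ to match any attributes instead of listing each one, so the pattern keeps working if the attributes change (more reliable)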
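
To see those tricks working together, here is a minimal sketch that runs the pattern against the markup from the question. The helper name getTableContents() is made up for illustration, and preg_match_all() is used so that every table on the page is returned, not just the first one.

```php
<?php
// Hypothetical helper: returns the inner content of every <table> in $html.
function getTableContents(string $html): array
{
    // '#' delimiters mean the '/' in </table> needs no escaping.
    // 's' lets '.' span the newlines between the opening and closing tags.
    preg_match_all('#<table[^>]+>(.+?)</table>#ims', $html, $matches);

    // $matches[1] holds the text captured by the (.+?) group of each match.
    return array_map('trim', $matches[1]);
}

$html = <<<HTML
<table class="qprintable2" width="100%" cellpadding="4" cellspacing="0" border="0">
content goes here !
</table>
HTML;

print_r(getTableContents($html));
```

Because (.+?) is non-greedy, each match stops at the first </table> it meets, so nested tables would be cut short. That is one of the reasons a real parser is the safer choice.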

While this is a case where regexes would work okayish, the general consensus is that you should use QueryPath or phpQuery (or similar) to extract HTML. It's also much simpler:

qp($html)->find("table")->text();  //would return just the text content
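
If installing a library is not an option, roughly the same thing can be sketched with PHP's bundled DOM extension (DOMDocument and DOMXPath) instead of QueryPath. This is an alternative technique, not what the line above does. The helper name getTableText() is made up for illustration, and the sample markup puts the text in a table cell, which is an assumption about the real page.

```php
<?php
// Hypothetical helper: text content of every <table> with the given class.
// Assumes $class contains no single quotes, since it is placed in the XPath as-is.
function getTableText(string $html, string $class): array
{
    // Collect libxml's complaints about imperfect HTML instead of emitting warnings.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    $texts = [];
    // Match on the class attribute only, ignoring width, cellpadding, and so on.
    foreach ($xpath->query("//table[@class='$class']") as $table) {
        $texts[] = trim($table->textContent);
    }
    return $texts;
}

$html = '<table class="qprintable2" width="100%"><tr><td>content goes here !</td></tr></table>';
print_r(getTableText($html, 'qprintable2'));
```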
mario