I have a table:

<table class="table_class" >
    <tr>
        <td>key</td>
        <td>value</td>
    </tr>
</table>

The table may contain any number of <tr> rows.

I have this regexp:

<table class="table_class">(<tr.*?><td>(.*?)</td><td>(.*?)</td></tr>){1,}</table>

But the matches array contains only the last match.

I cannot use just (<tr.*?><td>(.*?)</td><td>(.*?)</td></tr>) on its own, because the page may contain other tables.

I remove all whitespace before applying preg_match_all. How can I do this? Thanks!

UPD: an example with several tables:

<table>
    <tr>
        <td>key</td>
        <td>value</td>
    </tr>
</table>
<table class="table_class" >
    <tr>
        <td>key</td>
        <td>value</td>
    </tr>
</table>

Still, I would like to know why my regexp matches only the last <tr>.

ambrous
  • First off, your example table has no `` tags... Can you give a full example with more than one table? Also, you're not allowing any whitespace between tags in your regex, though your example does have whitespace. – coffee-converter Dec 10 '15 at 20:45
  • Use a parser, not a regex: http://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php. – chris85 Dec 10 '15 at 20:46
  • OK, I will try a parser. I just always thought that regexps were faster. – ambrous Dec 11 '15 at 08:17
  • Yes, I do not allow whitespace in the regexp because I remove it before applying preg_match_all. As for the tags, sorry, just one site has this HTML. I have corrected the example. – ambrous Dec 11 '15 at 08:33
  • @ambrous you have two answers below demonstrating how to use parsers; neither work/help for you? If one does please be sure to accept it; http://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work. – chris85 Dec 11 '15 at 16:30

1 Answer

Now usually I'm first to say it's fine to use regexps to extract data from HTML occasionally, as it's oft just faster and more efficient to do so than using a real parser. This is not one of those cases as the structure of the HTML is more than relevant.
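As for why only the last <tr> shows up: in PCRE, a capturing group that sits inside a quantifier is overwritten on every repetition, so once the match finishes only the final iteration is left in the group. A minimal sketch of that behaviour, assuming the whitespace has already been stripped as described in the question (the `$html` sample and the variable names are made up for illustration):

```php
<?php
// Hypothetical sample input, whitespace between tags already removed.
$html = '<table class="table_class">'
      . '<tr><td>key1</td><td>value1</td></tr>'
      . '<tr><td>key2</td><td>value2</td></tr>'
      . '</table>';

// The pattern from the question, wrapped in ~ delimiters.
$pattern = '~<table class="table_class">(<tr.*?><td>(.*?)</td><td>(.*?)</td></tr>){1,}</table>~';

// The whole table is ONE match. Groups 2 and 3 are overwritten each time
// group 1 repeats, so only the cells of the last row survive.
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$matchCount = count($matches);
$lastKey    = $matches[0][2];
$lastValue  = $matches[0][3];

var_dump($matchCount, $lastKey, $lastValue);
```

So the regexp does match every row; it just has nowhere to keep the earlier ones.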

Instead consider something like this:

// loadHTML() is an instance method; calling it statically throws an Error on PHP 8.
$doc = new DOMDocument();
$doc->loadHTML(<<<HTML
<table class="table_class" >
    <tr><td>key1</td><td>value1</td></tr>
    <tr><td>key2</td><td>value2</td></tr>
    <tr><td>key3</td><td>value3</td></tr>
    <tr><td>key4</td><td>value4</td></tr>
</table>
HTML
);
// Walk every row, then dump the text of each cell in it.
foreach ($doc->getElementsByTagName('tr') as $row) {
    foreach ($row->getElementsByTagName('td') as $cell) {
        var_dump($cell->nodeValue);
    }
}

See it in action here.
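Since the page in the question can contain other tables, the rows probably need to be limited to the one with class "table_class". One way to sketch that is with DOMXPath; the sample markup, the `$pairs` array and the key/value pairing below are assumptions for illustration, not something taken from the real page:

```php
<?php
// Hypothetical page with two tables; only the second has the target class.
$html = <<<HTML
<table>
    <tr><td>other</td><td>ignored</td></tr>
</table>
<table class="table_class" >
    <tr><td>key1</td><td>value1</td></tr>
    <tr><td>key2</td><td>value2</td></tr>
</table>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

$pairs = [];
// Select only rows inside the table whose class attribute is exactly "table_class".
foreach ($xpath->query('//table[@class="table_class"]//tr') as $row) {
    $cells = $row->getElementsByTagName('td');
    // Skip rows that do not have at least a key cell and a value cell.
    if ($cells->length >= 2) {
        $pairs[trim($cells->item(0)->nodeValue)] = trim($cells->item(1)->nodeValue);
    }
}

var_dump($pairs);
```

The exact-match test on @class is deliberately strict; if the real table carries several classes, the XPath expression would need a contains()-style check instead.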

Niels Keurentjes