0

I am learning to use regular expressions and would like to grab some data from a table.

The file looks like this:

$subject = '
<tbody>
            <tr>
                <td>1</td>
                <td>2</td>
                <td>3</td>
            </tr>
            <tr>
                <td>4</td>
                <td>5</td>
                <td>6</td>
            </tr>
        </tbody>';

Currently I am doing the following:

$pattern = "/<tr>.*?<td><\/td>.*?<td>(.*?)<\/td>.../s";

preg_match( $pattern, $subject, $result);

This fills $result with an array:

$result = [
    0 => "tbody>...",
    1 => 1,
    2 => 2,
    3 => 3,
    4 => 4 ... n     
]

This seems inefficient, so I am attempting to grab a repeated pattern like so:

$pattern = "/<td>([0-9]{1,2})<\/td>/s";

This, however, only grabs the first number: 1

What would be the best way to go about this?

HappyCoder

3 Answers

2

You should use preg_match_all instead of preg_match. preg_match stops after the first match, while preg_match_all keeps searching through the entire subject string.

http://php.net/manual/en/function.preg-match-all.php

if (preg_match_all( $pattern, $subject, $matches)) {
    var_dump($matches);
}
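
As a minimal sketch of what this looks like with the second pattern from the question: the sample table is condensed here, and the variable names $count and $numbers are illustrative assumptions, not part of the original code.

```php
<?php
// Sample table from the question, condensed onto fewer lines.
$subject = '<tbody>
    <tr><td>1</td><td>2</td><td>3</td></tr>
    <tr><td>4</td><td>5</td><td>6</td></tr>
</tbody>';

// The second pattern from the question, unchanged.
$pattern = '/<td>([0-9]{1,2})<\/td>/s';

// preg_match_all returns the number of full matches (or false on error).
$count = preg_match_all($pattern, $subject, $matches);

// With the default PREG_PATTERN_ORDER flag, $matches[0] holds the full
// matches and $matches[1] holds the text captured by the first group.
$numbers = array_map('intval', $matches[1]);

print_r($numbers);
```

If you would rather have one entry per match, each containing the full match and its groups, passing PREG_SET_ORDER as the fourth argument changes the layout of $matches accordingly.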
jiboulex
1

Here's a way to accomplish this using a parser:

$subject = '
<tbody>
            <tr>
                <td>1</td>
                <td>2</td>
                <td>3</td>
            </tr>
            <tr>
                <td>4</td>
                <td>5</td>
                <td>6</td>
            </tr>
        </tbody>';
$html = new DOMDocument();
$html->loadHTML($subject);
$tds = $html->getElementsByTagName('td');
foreach($tds as $td){
    echo $td->nodeValue . "\n";
    if(is_numeric($td->nodeValue)) {
        echo "it's a number \n"; 
    }
}

Output:

1
it's a number 
2
it's a number 
3
it's a number 
4
it's a number 
5
it's a number 
6
it's a number 
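
If the row structure matters, the same parser approach can be extended to collect the cells row by row. This is only a sketch: the $rows and $cells names are assumptions for illustration, and the input is wrapped in a <table> element for tidiness.

```php
<?php
$subject = '<table><tbody>
<tr><td>1</td><td>2</td><td>3</td></tr>
<tr><td>4</td><td>5</td><td>6</td></tr>
</tbody></table>';

$html = new DOMDocument();
$html->loadHTML($subject);

// Build one array of cell values per <tr>.
$rows = [];
foreach ($html->getElementsByTagName('tr') as $tr) {
    $cells = [];
    // getElementsByTagName on an element only searches its descendants.
    foreach ($tr->getElementsByTagName('td') as $td) {
        $cells[] = (int) trim($td->nodeValue);
    }
    $rows[] = $cells;
}

print_r($rows);
```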
chris85
  • This looks interesting and may actually work for my needs. I just tried it out and I see it dies if there is an invalid tag in the HTML... is there a way around this? – HappyCoder Dec 23 '15 at 17:11
  • It shouldn't die; it should only throw warnings. See https://eval.in/489883: if you uncomment the two lines there, you'll see the errors go away. – chris85 Dec 23 '15 at 17:18
  • Are you able to recommend documentation for this? I would like to learn what else I can do. – HappyCoder Dec 23 '15 at 17:22
  • The PHP site does, http://php.net/manual/en/book.dom.php. There's also a write up here on it, http://stackoverflow.com/questions/4979836/domdocument-in-php/4983721#4983721 – chris85 Dec 23 '15 at 17:23
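
One common way to keep the parser warnings quiet is libxml's internal error handling. Whether this matches the two lines in the linked snippet is an assumption, and the <bogus> tag below is a made-up example of invalid markup.

```php
<?php
// Hypothetical input containing a tag the HTML parser does not recognise.
$subject = '<table><tbody><tr><td>1</td><td><bogus>2</bogus></td></tr></tbody></table>';

// Collect libxml parse errors internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$html = new DOMDocument();
$html->loadHTML($subject);

// Keep the collected errors for inspection, then clear the buffer and
// restore the previous error-handling setting.
$errors = libxml_get_errors();
libxml_clear_errors();
libxml_use_internal_errors($previous);

$values = [];
foreach ($html->getElementsByTagName('td') as $td) {
    $values[] = trim($td->nodeValue);
}

print_r($values);
```

The entries in $errors are LibXMLError objects, so their message and line properties can be logged if you want to know what the parser objected to.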
0

To get all the values rather than stopping after the first match, many regex flavors use a g (global) flag.

PHP has no such flag; the same behavior is provided by the preg_match_all function.

Since the data will always be contained in a td you can do the following:

preg_match_all("/<td>(.*?)<\/td>/", $subject, $matches);
var_dump($matches);

Here $subject contains your HTML. The dump shows the full matches in $matches[0] and the captured table data in $matches[1].
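
Whether the quantifier inside the capture group is greedy matters once several cells share a line. The sketch below uses a hypothetical one-line row and compares (.*) with (.*?); the $greedy and $lazy names are only for illustration.

```php
<?php
// Hypothetical row with all cells on a single line.
$subject = '<tr><td>1</td><td>2</td><td>3</td></tr>';

// Greedy: .* runs to the last </td> on the line, swallowing the tags between.
preg_match_all('/<td>(.*)<\/td>/', $subject, $greedy);

// Lazy: .*? stops at the first </td>, giving one match per cell.
preg_match_all('/<td>(.*?)<\/td>/', $subject, $lazy);

var_dump($greedy[1]);
var_dump($lazy[1]);
```

With the multi-line layout from the question both forms behave the same, because without the s modifier the dot does not match a newline.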

Calle Bergström