I'm trying to capture the text "Capture This" in $string below.

$string = "</th><td>Capture This</td>";
$pattern = "/<\/th>\r.*<td>(.*)<\/td>$/";

preg_match ($pattern, $string, $matches);

echo($matches);

However, that just returns "Array". I also tried printing $matches using print_r, but that gave me "Array ( )".

This pattern will only come up once, so I just need it to match one time. Can somebody please tell me what I'm doing wrong?

hhwhy

2 Answers

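The problem is that your pattern requires a carriage return (\r) right after </th>, and the input string does not contain one, so nothing matches. You should also make the quantifier inside the capturing group lazy, and use print_r rather than echo to output the array. Like this: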

$pattern = "/<\/th>.*<td>(.*?)<\/td>$/";

You can see it in action here: http://codepad.viper-7.com/djRJ0e

Note that it's recommended to parse html with a proper html parser rather than using regex.
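As a rough sketch of the parser approach, the snippet below uses PHP's built-in DOMDocument and DOMXPath to pick the cell that follows a header cell, with no ID or class needed. The surrounding table markup and the "Label" header text are assumptions made for illustration, since the question only shows a fragment; the XPath expression would need adjusting to the real page.

```php
<?php
// Hypothetical markup: the question shows only a fragment, so a full row is assumed here.
$html = '<table><tr><th>Label</th><td>Capture This</td></tr></table>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Select the first <td> that is a following sibling of a <th>.
$xpath = new DOMXPath($doc);
$cells = $xpath->query('//th/following-sibling::td[1]');

// Fall back to null when no such cell exists.
$captured = $cells->length > 0 ? trim($cells->item(0)->textContent) : null;
echo $captured, PHP_EOL;
```

If the page has several header cells, the query returns one node per match, so narrowing the expression (for example by the header's text) may be necessary.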

Marcus
  • Thank you very much, Marcus. Can you suggest an HTML parser that would be best for a simple situation like this? Would you recommend a specific library, or should I use PHP's DOM functions? – hhwhy Nov 22 '11 at 16:26
  • @hhwhy this might shed some light: http://stackoverflow.com/questions/3577641/best-methods-to-parse-html-with-php/3577662#3577662 and http://stackoverflow.com/questions/292926/robust-mature-html-parser-for-php – Marcus Nov 22 '11 at 16:39
  • I would actually prefer to use PHP's DOM functions, but I just wasn't able to find any functions that can capture a single tag of many that don't have either an ID or class assigned to them. But, I will keep looking now that I understand that it's looked down upon to use regular expressions in this way. – hhwhy Nov 22 '11 at 17:30

Two things:

  1. You need to drop the \r from your regex as there is no carriage return character in your input string.

  2. Change echo($matches) to print_r($matches) or var_dump($matches), because echo on an array only prints the word "Array".
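The two changes above can be sketched as follows. This is a minimal example that reuses the string from the question; checking the return value of preg_match and echoing $matches[1] are additions for illustration, not something the question's code did.

```php
<?php
$string = "</th><td>Capture This</td>";

// Same pattern as in the question, minus the \r that the input never contains.
$pattern = '/<\/th>.*<td>(.*)<\/td>$/';

$matches = [];
if (preg_match($pattern, $string, $matches) === 1) {
    // print_r shows the whole array: index 0 is the full match, index 1 the captured group.
    print_r($matches);
    // The captured text on its own.
    echo $matches[1], PHP_EOL;
} else {
    echo "No match", PHP_EOL;
}
```

Reading $matches[1] directly is usually what you want once the match succeeds, since index 0 always holds the entire matched string.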

codaddict