Here is my regex:

$table_pattern = "/<TABLE.*?>(.*?)<\/TABLE>/is";

Like the title says, it works in PHP 5.1 and 5.3, but not in 5.2. I'm using it in this preg_match_all call:

preg_match_all($table_pattern, $page_content, $table_content);

$table_content is NULL on 5.2, but populated on 5.1 and 5.3. Any idea as to why?

Additional details:

$car_count = 47; //How many cars are currently online
$page_content = file_get_contents('http://www.site.com/temps/inventory.cfm?ChangeItems='.$car_count); // What page will be parsed
$page_start = 10277; //Where the parsing should start

$page_content = substr($page_content, $page_start); //Removes all of the text above the table we need
$table_pattern = "/\<TABLE.*?\>(.*?)\<\/TABLE\>/is";
preg_match_all($table_pattern, $page_content, $table_content); //Finds all tables inside of $page_content and fills the $table_content array
$final_content = $table_content[0][0]; //Setting the first table, which is the match we need, to $final_content

$final_content is coming up as NULL. Obviously there is more happening below this in my code but it's irrelevant.

I solved my own problem by - wait for it - NOT using RegEx! I initially thought regex would be much faster than dealing with the PHP Simple HTML DOM Parser, but it wasn't. I am still curious why this does not work in certain versions, though.

ohiock
  • [The pony he comes...](http://stackoverflow.com/a/1732454/554546) –  Feb 10 '12 at 21:35
  • Maybe 5.2 was smart enough to avoid regexing xml. :) – Jonathan M Feb 10 '12 at 21:36
  • possible duplicate of [RegEx match open tags except XHTML self-contained tags](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) –  Feb 10 '12 at 21:41
  • @tbowman - Can you post the HTML snippet that fails in 5.2 but works on all other versions? – nickb Feb 10 '12 at 22:07
  • Do you have full error reporting on? `ini_set('display_errors', 1)` http://codepad.org/jvr99JBF – Shad Feb 10 '12 at 22:12
  • It's an entire web page so I am not going to post it all. I'll turn on error reporting and see what I get out of it. (Edit: no dice) – ohiock Feb 11 '12 at 16:18
  • Perhaps the [configuration directives](http://www.php.net/manual/en/pcre.configuration.php) are different across the versions? – cmbuckley Feb 11 '12 at 21:36
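
One way to follow up on the configuration idea is to ask PCRE why the match failed instead of only looking at the empty result. The sketch below is an illustration, not the asker's code: `preg_error_name()` and `first_table_match()` are hypothetical helpers, and `$sample` is a tiny stand-in for the real page, which is not shown. As far as I recall from the PHP manual, `pcre.backtrack_limit` was introduced in PHP 5.2.0 with a default of 100000 and raised to 1000000 in PHP 5.3.7, which would fit the "works in 5.1 and 5.3, fails in 5.2" pattern on a large page, but that is an assumption worth checking on the affected server.

```php
<?php
// Map a preg_last_error() code to its constant name.
// preg_last_error() and these constants exist since PHP 5.2.0.
function preg_error_name($code)
{
    $names = array(
        PREG_NO_ERROR              => 'PREG_NO_ERROR',
        PREG_INTERNAL_ERROR        => 'PREG_INTERNAL_ERROR',
        PREG_BACKTRACK_LIMIT_ERROR => 'PREG_BACKTRACK_LIMIT_ERROR',
        PREG_RECURSION_LIMIT_ERROR => 'PREG_RECURSION_LIMIT_ERROR',
        PREG_BAD_UTF8_ERROR        => 'PREG_BAD_UTF8_ERROR',
    );
    return isset($names[$code]) ? $names[$code] : 'unknown error ' . $code;
}

// Hypothetical helper: return the first <TABLE>...</TABLE> match, or null,
// and report the PCRE error state through $error.
function first_table_match($page_content, &$error)
{
    $table_pattern = '/<TABLE.*?>(.*?)<\/TABLE>/is';
    $result = preg_match_all($table_pattern, $page_content, $table_content);
    $error = preg_error_name(preg_last_error());
    if ($result === false || $result === 0) {
        return null; // error or no match; $error says which
    }
    return $table_content[0][0];
}

// Tiny stand-in for the real page (assumption: the real input is much larger).
$sample = 'junk<TABLE border="1"><TR><TD>car</TD></TR></TABLE>junk';
$table = first_table_match($sample, $error);

echo 'pcre.backtrack_limit: ', ini_get('pcre.backtrack_limit'), "\n";
echo 'preg error: ', $error, "\n";
echo 'first table: ', var_export($table, true), "\n";
```

If the error name on the real page comes back as `PREG_BACKTRACK_LIMIT_ERROR`, raising the limit with `ini_set('pcre.backtrack_limit', ...)` before the match would be the thing to try.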

1 Answer

It is possible to parse XML in PHP with recursive regex, but please use an XML library instead of regex. It's much cleaner and easier...

(Your code fails if you have nested "tables" in your XML...)
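
To make the nested-table point concrete, here is a minimal sketch comparing the lazy regex with PHP's bundled DOM extension. The markup in `$page_content` is an invented sample, not the asker's page, and `DOMDocument` is swapped in here as one possible XML/HTML library rather than the one the asker ended up using.

```php
<?php
// Invented sample with a nested table; the real page is not shown in the question.
$page_content = '<html><body>'
    . '<TABLE id="outer"><TR><TD>'
    . '<TABLE id="inner"><TR><TD>car</TD></TR></TABLE>'
    . '</TD><TD>tail</TD></TR></TABLE>'
    . '</body></html>';

// The lazy regex stops at the first </TABLE>, which closes the inner table,
// so the outer table is cut off before the "tail" cell.
preg_match('/<TABLE.*?>(.*?)<\/TABLE>/is', $page_content, $m);
$regex_table = $m[0];

// DOMDocument tracks nesting, so the outer table comes back whole.
libxml_use_internal_errors(true); // real-world HTML is rarely valid; keep warnings quiet
$doc = new DOMDocument();
$doc->loadHTML($page_content);
libxml_clear_errors();

// The HTML parser lowercases element names, hence 'table'.
$first = $doc->getElementsByTagName('table')->item(0);
$dom_table = $doc->saveXML($first); // saveXML($node) also works on PHP 5.2

echo "regex: ", $regex_table, "\n";
echo "dom:   ", $dom_table, "\n";
```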

PCRE is developed by a different team than PHP, and older versions have bugs. The PCRE build bundled with your PHP 5.2 may have a bug that was fixed in later versions.
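
To compare the PCRE builds across the three servers, a short sketch like the following could be run on each one. `PCRE_VERSION` is a predefined constant in the PCRE extension (available since PHP 5.2.4, as far as I know), so the `defined()` guard is there for older builds.

```php
<?php
// Report the PHP version and the PCRE library version it was built against.
$pcre_version = defined('PCRE_VERSION') ? PCRE_VERSION : 'unknown (constant not defined)';

echo 'PHP:  ', PHP_VERSION, "\n";
echo 'PCRE: ', $pcre_version, "\n";
```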

Another explanation could be that your XML is Unicode and you didn't use the "u" modifier.
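
One caveat on the "u" modifier: it makes PCRE treat both pattern and subject as UTF-8, so it only helps when the page really is valid UTF-8. The sketch below uses an invented ISO-8859-1 sample string to show the other side of that: with "u", the match fails outright and `preg_last_error()` reports `PREG_BAD_UTF8_ERROR`, while the same pattern without "u" still matches byte by byte.

```php
<?php
// "\xE9" is "é" in ISO-8859-1 but an invalid byte sequence in UTF-8 (invented sample).
$latin1 = "<TABLE><TR><TD>caf\xE9</TD></TR></TABLE>";

// With the u modifier, PCRE validates the subject as UTF-8 before matching.
$result_u = preg_match('/<TABLE.*?>(.*?)<\/TABLE>/isu', $latin1, $m_u);
$error_u  = preg_last_error(); // read it right away, before the next preg_* call

// Without the u modifier, the subject is treated as plain bytes.
$result_plain = preg_match('/<TABLE.*?>(.*?)<\/TABLE>/is', $latin1, $m_plain);

var_dump($result_u);                          // false on invalid UTF-8
var_dump($error_u === PREG_BAD_UTF8_ERROR);
var_dump($result_plain);
```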

It works for me on PHP 5.2.17. Which version do you have?

inf3rno
  • I have since moved on to use the PHP Simple HTML DOM Parser to accomplish what I needed. I can't recall what exact version it was, I'll have to check again. – ohiock Feb 15 '12 at 17:47