preg_match_all not working as tested

Question

This is not the same as the linked "duplicate": that question explains alternatives to regexp, but it does not offer a solution to this problem.

I am trying to use preg_match_all to parse a scraped page (http://www.sportsbookreview.com/betting-odds/). I have tested my regexp />([A-Z]+) - / on two sites (http://www.phpliveregex.com/ and functions-online.com/preg_match_all.html), and it works on both. I have also pasted the snippet I am parsing directly into my code, and it works there too. When I run the same regexp on the live data, however, it returns no results.

My only theory is that the page contains a hidden character that is lost when I copy and paste the snippet into the live testing sites.

The full code is below. Thanks for your help.

<?php

function curl($url) {
    $curlAgent = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.0.3705; .NET CLR 1.1.4322)';
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    curl_setopt($ch, CURLOPT_USERAGENT, $curlAgent);

    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}

$strUrl = 'http://www.sportsbookreview.com/betting-odds/';

$data = curl($strUrl);

$strGames = explode('@id',$data);

echo "<br>Number of games on page: ".count($strGames)."<br>";

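for ($i = 1; $i < count($strGames); $i++) {
    // echo $strGames[$i];
    // Strip control characters (Unicode category C) other than whitespace.
    $clean = preg_replace('/[^\PC\s]/u', '', $strGames[$i]);
    // preg_match_all() returns the number of matches, or false on failure.
    $error = preg_match_all("~>([A-Z]+) - ~m", $clean, $strTeams);
    var_dump($strTeams);
}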

?>
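
One way to check the hidden-character theory is to dump the raw bytes of the text that should match. The sketch below is only an illustration: the sample string, and the choice of a non-breaking space (U+00A0) as the hidden character, are assumptions and have not been confirmed for the scraped page.

```php
<?php
// Hypothetical sample, not taken from the real page: the "spaces" around the
// dash are non-breaking spaces (U+00A0, bytes C2 A0 in UTF-8).
$sample = ">NYY\xC2\xA0-\xC2\xA0BOS";

// The original pattern expects plain ASCII spaces, so it finds nothing here.
$plainCount = preg_match_all('~>([A-Z]+) - ~', $sample, $plain);

// A hex dump exposes bytes that look like ordinary spaces on screen.
$hex = bin2hex($sample);

// \h matches any horizontal whitespace, including U+00A0, in UTF-8 mode (/u).
$flexCount = preg_match_all('~>([A-Z]+)\h+-\h+~u', $sample, $flex);

echo "plain matches: $plainCount\n";
echo "hex dump: $hex\n";
echo "flexible matches: $flexCount, first team: {$flex[1][0]}\n";
```

On the live data, the same bin2hex() call could be applied to a short substr() of $strGames[$i] around the expected match and compared with the hex dump of the pasted snippet.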

Try `$clean = preg_replace('/[\p{C}\s]+/u', '', $strGames[$i]);` and then `$error = preg_match_all("~>([A-Z]+)-~", $strGames[$i], $strTeams);` — Wiktor Stribiżew, Apr 14 '17 at 22:46
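
The suggestion in the comment above can be tried on a small string first. This is a minimal sketch under stated assumptions: the sample string is invented, and the match is run against $clean rather than $strGames[$i], which is an assumption about what the comment intended. It also checks for a null result, because preg_replace() with the /u modifier returns null when the subject is not valid UTF-8.

```php
<?php
// Hypothetical sample: a zero-width space (U+200B) after the team code,
// plus ordinary spaces and a newline around the dash.
$sample = ">NYY\xE2\x80\x8B - \nBOS";

// Remove control/format characters (\p{C}) and all whitespace.
$clean = preg_replace('/[\p{C}\s]+/u', '', $sample);

// With /u, preg_replace() returns null if the subject is not valid UTF-8.
if ($clean === null) {
    echo 'preg_replace failed, error code: ' . preg_last_error() . "\n";
    $clean = '';
}

// Whitespace is gone, so the pattern no longer contains spaces.
$count = preg_match_all('~>([A-Z]+)-~', $clean, $strTeams);

echo "$clean\n";
var_dump($count, $strTeams[1]);
```

If the null branch is ever reached on the scraped page, that by itself could explain an empty result, since the later match would run against an empty subject.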
