I am trying to extract a list of Steam IDs from a scraped HTML file. This is what I have so far, but it isn't working: it parses an HTML page I saved as text and is supposed to output the IDs matched by the pattern below, but instead it outputs a blank page.

<?php
$filein = file('TF2U.txt');
foreach ($filein as $html) {
    $pattern = '#.*<a[^>]+href="steamcommunity.com/profiles/([0-9]+)/"#iA';
    $matches = NULL;
    $match_count = preg_match_all($pattern, $html, $matches);
    if ($match_count > 0) {
        echo implode($matches[1]);
        echo "<br>\n";
    }
}
?>

Any help would be awesome. I am not sure what I am missing, but it's probably simple.

SuperMar1o

1 Answer

The problem is that the links don't end with a /, while your pattern requires one right after the ID. Here's a solution with some tweaks:

$file = file_get_contents('TF2U.htm');
// Capture the numeric ID in a named group; no trailing slash is required after it.
preg_match_all('#<a.*?href="(?:http://)steamcommunity\.com/profiles/(?P<id>\d+)[^>]+#msi', $file, $matches);
print_r($matches['id']);
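
If you want something reusable, the same idea can be sketched as a small function. This is only a rough sketch under a few assumptions of mine: the name extractSteamIds and the sample markup are made up for illustration, and I am guessing that the links may use http, https, or no scheme at all, with or without a trailing slash.

```php
<?php
// Hypothetical helper (not part of the original snippet): returns the unique
// numeric IDs found in steamcommunity.com profile links inside $html.
function extractSteamIds(string $html): array
{
    // Assumptions: the scheme is optional, the dots are escaped, and nothing
    // is required after the digits, so a missing trailing slash is fine.
    $pattern = '#<a\b[^>]*\bhref="(?:https?://)?steamcommunity\.com/profiles/(?P<id>\d+)#i';
    preg_match_all($pattern, $html, $matches);

    // Drop duplicate IDs and reindex the result from 0.
    return array_values(array_unique($matches['id']));
}

// Made-up sample markup standing in for the saved page.
$sample = '<a class="x" href="http://steamcommunity.com/profiles/76561198000000001">A</a>'
        . "\n"
        . '<a href="https://steamcommunity.com/profiles/76561198000000002/">B</a>';

print_r(extractSteamIds($sample));
```

To run it against the real page, you would pass in file_get_contents('TF2U.htm') instead of $sample. Reading the whole file at once also avoids missing a link whose tag is split across two lines.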
HamZa