I am trying to search for a text string in PHP. To do this, I am loading the complete webpage source into a variable using:

$filename = "http://google.com/";
$filehandle = fopen($filename, "rt");
$contents = fread($filehandle, 10000);

Now I want to read the data inside the span with this id:

<span style="font-size:18px" id="countdown">4d 19h 34m 43s</span>

I have written this piece of code, but it is not working for me:

$string = "id\=\"countdown\"";

if(strstr($contents,$string)) {
echo "found it.";
} else {
echo "not found.";
}

I wish to use an operator like (.+), as in Perl, where if we match a string with the syntax

~/abc(.+)ghi/

then the data between abc and ghi is assigned to the variable $1.

typedefcoder2
    Firstly, do not try to parse HTML with regex; read this - http://stackoverflow.com/questions/701166/can-you-provide-some-examples-of-why-it-is-hard-to-parse-xml-and-html-with-a-reg Secondly, your filename is Google.com; what are you trying to do? Download the Internets? – GEMI Dec 10 '11 at 07:33

3 Answers

The PHP equivalent of Perl's:

if($var=~/abc(.+)ghi/) {
  print $1;
}

is:

if(preg_match('/abc(.+)ghi/', $var, $match)) {
  print $match[1];
}

But to answer your original question of using regex to parse HTML, I suggest you look at proper HTML parsers.
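
As a rough illustration of the parser route, here is a minimal sketch using PHP's bundled DOM extension (DOMDocument and DOMXPath). It assumes the extension is enabled, and the $html sample is simply the span from the question wrapped in a hypothetical page; in practice it would be the downloaded source.

```php
<?php
// Sample markup based on the question; a real page would be fetched first.
$html = '<html><body><span style="font-size:18px" id="countdown">4d 19h 34m 43s</span></body></html>';

// Collect libxml warnings internally, since real-world HTML is often malformed.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// Select the first element whose id attribute is "countdown".
$xpath = new DOMXPath($doc);
$node  = $xpath->query('//*[@id="countdown"]')->item(0);

// $node is null when no such element exists.
$countdown = $node !== null ? trim($node->textContent) : null;
echo $countdown === null ? "not found." : $countdown;
```

Unlike a regex, this does not depend on the order of the attributes or on the exact spacing inside the tag.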

codaddict

As far as your example goes, you don't need to escape the = sign:

$string = "id=\"countdown\"";

if(strstr($contents,$string)) {
  echo "found it.";
} else {
  echo "not found.";
}

Alternatively you can use single quotes:

$string = 'id="countdown"';

This should solve your strstr() call, but I agree with codaddict's suggestion to use preg_match().
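
To see why the extra backslash matters, here is a small sketch: in a double-quoted PHP string, \= is not a recognized escape sequence, so the backslash stays in the string and the search text no longer appears in the page. The $contents value is just the span from the question, and strpos() is used here as an alternative to strstr() for a pure existence check.

```php
<?php
// Stand-in for the downloaded page source.
$contents = '<span style="font-size:18px" id="countdown">4d 19h 34m 43s</span>';

$escaped = "id\=\"countdown\"";  // the backslash before = is kept literally
$plain   = 'id="countdown"';     // single quotes: no escaping needed

// The escaped version really contains a backslash character.
$hasBackslash = ($escaped === 'id\="countdown"');

// strpos() returns false when the needle is absent, so compare with !== false.
$escapedFound = strpos($contents, $escaped) !== false;
$plainFound   = strpos($contents, $plain) !== false;

var_dump($hasBackslash, $escapedFound, $plainFound);
```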

dchrastil

OK, let's take the preg_match approach; this will search between the span tags and pull out the data:

preg_match('/<span style="font-size:18px" id="countdown">(.+)<\/span>/', $contents, $matches);
print_r($matches);

which should output something similar to this:

Array
(
    [0] => <span style="font-size:18px" id="countdown">4d 19h 34m 43s</span>
    [1] => 4d 19h 34m 43s
)
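
If the style attribute changes or more spans follow on the same line, a slightly looser pattern may hold up better. The sketch below is one possible variant, not a robust HTML parser: it keys on the id only and uses a lazy (.+?) so the capture stops at the first closing tag. The sample $contents with a second span is made up for illustration.

```php
<?php
// Hypothetical input with a second span after the one we want.
$contents = '<p><span style="font-size:18px" id="countdown">4d 19h 34m 43s</span> <span>other</span></p>';

// [^>]* skips any other attributes inside the opening tag;
// (.+?) is lazy, so it ends at the first </span> rather than the last one.
$pattern = '/<span[^>]*\bid="countdown"[^>]*>(.+?)<\/span>/s';

if (preg_match($pattern, $contents, $matches)) {
    $countdown = $matches[1];
    echo $countdown;
} else {
    $countdown = null;
    echo "not found.";
}
```

With a greedy (.+) on this input, the capture would run on to the last closing span tag and include the second span as well.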
dchrastil