How to get the content in a div using PHP

Question

In my application I am trying to get the number of pages Google has indexed, and I learned that the number is available in the following div:

<div id="resultStats"> About 1,960,000 results (0.38 seconds) </div>

Now my question is: how do I extract the number from the above div in a web page?

score 4 · Answer 1 · edited May 23 '17 at 12:00

Never use regexp to parse HTML. (See: RegEx match open tags except XHTML self-contained tags)

Use an HTML parser, like Simple HTML DOM (http://simplehtmldom.sourceforge.net/)

You can then use CSS selectors to pick out the element, and a small regexp on its plain text to pull out the number:

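// file_get_html() is provided by Simple HTML DOM; it fetches and parses the page.
$html = file_get_html('http://www.google.com/');
// Select the first div with id "resultStats" and take its text content.
$divContent = $html->find('div#resultStats', 0)->plaintext;

// Capture the first run of digits and commas in that text.
$matches = array();
preg_match('/([0-9,]+)/', $divContent, $matches);
echo $matches[1];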

Outputs: "1,960,000"
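
If adding an external library is not an option, the same lookup can probably be done with PHP's bundled DOM extension instead of Simple HTML DOM. The following is only a sketch under a few assumptions: the markup has already been fetched into a string, the DOM extension is enabled, and PHP 7.1 or later is available (for the nullable return type). The helper name `extractResultCount` and the `$html` literal are hypothetical and exist only for illustration.

```php
<?php
// Hypothetical helper: returns the result count as an integer,
// or null when the div or the number cannot be found.
function extractResultCount(string $html): ?int
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them;
    // real-world HTML is rarely valid.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    // Look the div up by its id attribute.
    $xpath = new DOMXPath($doc);
    $node = $xpath->query('//div[@id="resultStats"]')->item(0);
    if ($node === null) {
        return null;
    }

    // First run of digits and commas in the div's text, e.g. "1,960,000".
    if (!preg_match('/[0-9][0-9,]*/', $node->textContent, $m)) {
        return null;
    }

    // Drop the thousands separators before casting to int.
    return (int) str_replace(',', '', $m[0]);
}

// Sample markup taken from the question, wrapped in a minimal page.
$html = '<html><body><div id="resultStats"> About 1,960,000 results (0.38 seconds) </div></body></html>';
var_dump(extractResultCount($html)); // int(1960000)
```

Returning an integer rather than the raw string makes the value usable in comparisons and arithmetic; the comma-separated form is only needed for display.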

answered Jun 28 '13 at 08:09

netdigger

It will print `About 1,960,000 results (0.38 seconds) ` but he wanted the number, so a regex is necessary. – Robert Jun 28 '13 at 08:12
Once you've got the string out of the div, extracting the number from the string should be trivial. – GordonM Jun 28 '13 at 08:13
Quoting the OP: *now my question is how to extract the number from above div in a web page*. I'm not saying that using file_get_html() is a bad solution; it's good, but it's not what he wants. At least that's how I understand it. He wanted to know how to extract a particular number. – Robert Jun 28 '13 at 08:14
Yeah, missed that. Added a regexp. – netdigger Jun 28 '13 at 08:20
This regexp prints `0.38 seconds` ehhh ;) Moreover, you overwrite the result of find(), which is senseless too. – Robert Jun 28 '13 at 08:21
Now stealing the regex from my answer, which was **bad** haha :) My answer can be modified not to use `<div>` and it will work too :> So in the end you use an external library + regex (which is bad) - brilliant. – Robert Jun 28 '13 at 08:27
I'd say our regexps are pretty different. Yeah, I didn't read the full question at first, but I came up with [0-9,]+ myself; it's kind of the go-to answer for anyone trying to match 1,323,232,232 etc. – netdigger Jun 28 '13 at 08:30
It helps me to move further. Thanks a ton – Jun 28 '13 at 09:55

Robert · Answer 2 · 2013-06-28T08:20:24.843

$str = '<div id="resultStats"> About 1,960,000 results (0.38 seconds) </div> ';

$matches = array();
preg_match('/<div id="resultStats"> About ([0-9,]+?) results[^<]+<\/div>/', $str, $matches);

print_r($matches);

Output:

Array
(
    [0] => <div id="resultStats"> About 1,960,000 results (0.38 seconds) </div>
    [1] => 1,960,000
)

This is a simple regex with subpatterns:

([0-9,]+?) - matches the digits 0-9 and the , character, at least one time, and not greedy.
[^<]+ - matches any character except <, one or more times.

echo $matches[1]; - will print the number you want.
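
As a side note on the non-greedy quantifier: in this particular pattern it should not change the result, because the class `[0-9,]` cannot match the space that follows the number, so a greedy `+` stops at the same place. The sketch below compares the two on the string from the question and then strips the commas to get an integer. The variable names `$lazy`, `$greedy`, and `$count` are made up for this illustration.

```php
<?php
$str = '<div id="resultStats"> About 1,960,000 results (0.38 seconds) </div>';

// Lazy quantifier, as in the answer above.
preg_match('/About ([0-9,]+?) results/', $str, $lazy);

// Greedy quantifier: [0-9,] cannot match the following space,
// so the capture ends at the same character.
preg_match('/About ([0-9,]+) results/', $str, $greedy);

var_dump($lazy[1] === $greedy[1]); // bool(true), both hold "1,960,000"

// Remove the thousands separators to get a value usable in arithmetic.
$count = (int) str_replace(',', '', $lazy[1]);
var_dump($count); // int(1960000)
```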

score 1 · Answer 3 · answered Jun 28 '13 at 08:05

You can use regex ( preg_match ) for that

$your_div_string = '<div id="resultStats"> About 1,960,000 results (0.38 seconds) </div>';

preg_match('/<div.*>(.*)<\/div>/i', $your_div_string, $result);

print_r( $result );

The output will be:

Array  (
   [0] => <div id="resultStats"> About 1,960,000 results (0.38 seconds) </div>
   [1] =>  About 1,960,000 results (0.38 seconds) 
)

In this way you can get the content inside the div.

softsdev

It'll match everything between the first opening `<div>` and the last closing `</div>` in the document. – GordonM Jun 28 '13 at 08:07
@GordonM but he wants it from a specific div. – softsdev Jun 28 '13 at 08:08
I want the number from that div, but it gives something more that I don't want – Jun 28 '13 at 08:15
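
The greedy-matching problem raised in these comments can be reproduced with a small sketch. It assumes a hypothetical page fragment (`$page`) holding two divs; the `footer` div is invented for the illustration. The second pattern, which names the id and forbids `<` inside the capture, is one possible way to keep a regex confined to the intended div, not the only one.

```php
<?php
// Hypothetical fragment: the div we want, followed by an unrelated div.
$page = '<div id="resultStats"> About 1,960,000 results (0.38 seconds) </div>'
      . '<div id="footer">Help</div>';

// Pattern from the answer: the greedy .* runs to the last possible ">",
// so the overall match spans both divs and the capture is the wrong one.
preg_match('/<div.*>(.*)<\/div>/i', $page, $greedy);
var_dump($greedy[0] === $page); // bool(true), the match covers the whole fragment
var_dump($greedy[1]);           // string(4) "Help"

// Naming the id and excluding "<" from the capture keeps the match
// inside the intended div.
preg_match('/<div id="resultStats">([^<]*)<\/div>/', $page, $targeted);
var_dump(trim($targeted[1]));   // string(38) "About 1,960,000 results (0.38 seconds)"
```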
