I have the following HTML (as an example):

<span class="small margin-l5 left">
  <a  data-user-id="" class="showdataemployer">
    <span>
      (0 Reviews)
    </span>
 </a>
</span>

I would like to extract the "0" from "(0 Reviews)".

I have defined the following function to scrape the data:

function scrape_between($data, $start, $end){
    $data = stristr($data, $start); // Stripping all data from before $start
    $data = substr($data, strlen($start));  // Stripping $start
    $stop = stripos($data, $end);   // Getting the position of the $end of the data to scrape
    $data = substr($data, 0, $stop);    // Stripping all data from after and including the $end of the data to scrape
    return $data;   // Returning the scraped data from the function
}

In this instance I am using the following to try to capture that 0:

$reviews = scrape_between($projectPage,
"<a  data-user-id=\"\" class=\"showdataemployer\"><span>(",
"Reviews)</span>");

But so far I am getting a blank result. Any ideas? I'm guessing most people will recommend using a regex (preg_match) for this, but I can't seem to get my head around it. If that is the way to go, could somebody please show me an example of how a regex could extract the 0 in this particular case?

Very much appreciate the help. Thanks guys.

Dingo Bruce

1 Answer

Here's one way of doing it using the Simple HTML DOM Parser, http://simplehtmldom.sourceforge.net/manual.htm#section_traverse.

include_once 'simple_html_dom.php';
$html = str_get_html('<span class="small margin-l5 left">
  <a  data-user-id="" class="showdataemployer">
    <span>
      (0 Reviews)
    </span>
 </a>
</span>');
echo trim($html->find('span', 1)->plaintext);

Output:

(0 Reviews)
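
If only the number is wanted rather than the whole "(0 Reviews)" text, a small regex on that string should be enough. This is a minimal sketch, assuming the text always looks like "(N Reviews)"; the helper name extract_review_count is hypothetical and not part of any library:

```php
<?php
// Hypothetical helper: pulls the integer out of text such as "(0 Reviews)".
function extract_review_count(string $text): ?int
{
    // \(\s*       opening parenthesis, then optional whitespace
    // (\d+)       the digits we want to capture
    // \s*Reviews? "Review" or "Reviews", case-insensitive because of /i
    if (preg_match('/\(\s*(\d+)\s*Reviews?\s*\)/i', $text, $m)) {
        return (int) $m[1];
    }
    return null; // pattern not found
}

var_dump(extract_review_count('(0 Reviews)')); // int(0)
```

Because the pattern tolerates whitespace, it should also work when run directly against the raw page HTML. That is likely why scrape_between came back blank: in the sample markup the tags are separated by newlines and indentation, while the $start string passed to it has none, so stristr never finds a match.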

Simple HTML DOM is not bundled with PHP; it can be downloaded from http://simplehtmldom.sourceforge.net/. For other parsers, see the Stack Overflow question "How do you parse and process HTML/XML in PHP?"
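
As one such alternative, the same lookup can probably be done with PHP's built-in DOM extension, so no extra download is needed. This is a rough sketch, assuming the dom extension is enabled and that the link can be identified by its showdataemployer class as in the sample markup:

```php
<?php
$html = '<span class="small margin-l5 left">
  <a  data-user-id="" class="showdataemployer">
    <span>
      (0 Reviews)
    </span>
 </a>
</span>';

$doc = new DOMDocument();
// Collect libxml warnings internally instead of printing them;
// real-world pages are rarely perfectly valid HTML.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// Select the <span> directly inside the link with class "showdataemployer".
$node = $xpath->query('//a[contains(@class, "showdataemployer")]/span')->item(0);

// textContent keeps the surrounding whitespace, so trim it.
$text = $node ? trim($node->textContent) : '';
echo $text, "\n"; // (0 Reviews)
```

The XPath query matches on the class attribute rather than on exact surrounding text, so whitespace between the tags does not matter.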

chris85