
This is the code I have so far:

<?php
$curl = curl_init('WebHere');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);

$page = curl_exec($curl);

if(curl_errno($curl)) 
{
    echo 'Scraper error: ' . curl_error($curl);
    exit;
}

curl_close($curl);

$regex = '/<div class="stockinfo1">(.*?)<\/div>/s';
if ( preg_match($regex, $page, $list) )
    echo $list[0];
else 
    print "Not found"; 
?>

I'm trying to target a specific piece of a website. It's in a div with the class stockinfo1. How can I pull only that information, without the full page?

roibeart
Paul Brennan

1 Answer


To retrieve the portion of HTML you need, one option is a regular expression, although many people dislike parsing HTML that way. An alternative is a library that parses the DOM of the page, such as PHP Simple HTML DOM Parser. It is very simple to use, especially if you have experience with jQuery.

A solution using PHP Simple HTML DOM Parser could look like this:

$html = file_get_html($url); // you don't need to use curl
$yourDesiredContent = $html->find('div.stockinfo1', 0)->plaintext;
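
If you would rather not download an extra library, the same DOM-based idea can be sketched with PHP's built-in DOMDocument and DOMXPath classes. This is a different technique from Simple HTML DOM, shown only as a minimal sketch: the sample markup in `$page` and the helper name `extractDivText` are made up for illustration, and the helper assumes the class name contains no quote characters.

```php
<?php
// Hypothetical sample markup standing in for the page fetched with cURL.
$page = '<html><body><div class="header">Menu</div>'
      . '<div class="stockinfo1"><span>In stock:</span> 12 units</div></body></html>';

// Hypothetical helper: returns the text of the first <div> carrying $class,
// or null when no such div exists.
function extractDivText(string $html, string $class): ?string
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Match the class as a whole token, so "stockinfo1" does not match "stockinfo10".
    $query = sprintf(
        '//div[contains(concat(" ", normalize-space(@class), " "), " %s ")]',
        $class
    );

    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    return trim($nodes->item(0)->textContent);
}

$text = extractDivText($page, 'stockinfo1');
echo $text ?? 'Not found', PHP_EOL;
```

In your script you would pass the `$page` string returned by `curl_exec()` instead of the sample markup.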

Anyway, if you want to keep using regular expressions, change echo $list[0]; to echo $list[1]; in your code. $list[0] holds the entire match, including the div tags, whereas you want only the content captured by the parentheses in your regular expression, which is group number 1 (the only group).
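
To make the difference between the two indexes concrete, here is a small sketch that runs the regular expression from the question against a made-up string (the contents of `$page` are an assumption for illustration, not the real page):

```php
<?php
// Hypothetical sample markup; in the real script $page comes from curl_exec().
$page  = '<p>intro</p><div class="stockinfo1">12 units</div><p>outro</p>';
$regex = '/<div class="stockinfo1">(.*?)<\/div>/s';

if (preg_match($regex, $page, $list)) {
    echo $list[0], PHP_EOL; // whole match, tags included
    echo $list[1], PHP_EOL; // capture group 1: only the text inside the div
} else {
    echo 'Not found', PHP_EOL;
}
```

Note that the lazy `(.*?)` stops at the first closing div tag, so this pattern only captures the full content when the target div has no nested divs.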

roibeart
  • Thanks, but now I get this error: Fatal error: Call to undefined function file_get_html() in /home/cabox/workspace/Section3/section3.php on line 5 – Paul Brennan Jan 30 '16 at 18:23
  • Have you downloaded the library php file (http://sourceforge.net/projects/simplehtmldom/files/) and included it in the script? – roibeart Jan 30 '16 at 18:47
  • Nope. What would I type? include ___ ? – Paul Brennan Jan 30 '16 at 19:05
  • Download the file simple_html_dom.php from sourceforge.net/projects/simplehtmldom/files, place it in the same folder as your script, and finally write at the top of your file, under the ` – roibeart Jan 30 '16 at 19:14
  • @PaulBrennan Consider what roibeart commented above. You can keep the code you wrote yourself, but change echo `$list[0]` to echo `$list[1]`. That prints only what the parentheses in the regular expression matched, which seems to be what you are looking for. – MarkSkayff Jan 30 '16 at 20:39
  • @MarkSkayff It still does not work. With that change in place, it says "Not found". – Paul Brennan Jan 30 '16 at 21:10
  • Ugh!!!! I'm done. My bad eyesight missed the little 'r' after stockinfo.... Thanks for your help, guys! – Paul Brennan Jan 30 '16 at 21:12