The scrape works, but the strange thing is that the result is ["-3°"].

I have tried many different things to get just -3°.

How is it that [" and "] show up when they are not in the code?

Can someone give me some direction on how to achieve this?

The code I am using is:

<?php
function scrape($url){
    $output = file_get_contents($url);
    return $output;
}

function fetchdata($data, $start, $end){
    $data = stristr($data, $start);         // Strip all data before $start
    $data = substr($data, strlen($start));  // Strip $start itself
    $stop = stripos($data, $end);           // Position of $end in the remaining data
    $data = substr($data, 0, $stop);        // Strip $end and everything after it
    return $data;                           // Return the scraped data
}

$page = scrape("https://weather.gc.ca/city/pages/bc-37_metric_e.html");
$result = fetchdata($page, "<p class=\"text-center mrgn-tp-md mrgn-bttm-sm lead\"><span class=\"wxo-metric-hide\">", "<abbr title=\"Celsius\">C</abbr>");
echo json_encode(array($result));
?>

Thanks in advance for your help!

bcman
  • Possible duplicate of [How do you parse and process HTML/XML in PHP?](http://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php) – chris85 Nov 29 '15 at 20:14
  • Not sure what you are trying to say; the strange thing is that extra characters show up that I can't find in the code. – bcman Nov 29 '15 at 20:23
  • Chris85, I do not understand your answer, but thanks – bcman Nov 29 '15 at 20:24
  • I hadn't previously posted an answer. I've now posted an answer; take a look below. If that resolves your issue please be sure to accept it; if not please post question/issues. – chris85 Nov 29 '15 at 20:35

1 Answer

You can use DOMDocument to parse the HTML instead of cutting the string by hand.

$page = file_get_contents("https://weather.gc.ca/city/pages/bc-37_metric_e.html");
$doc = new DOMDocument();
// Suppress libxml warnings about imperfect real-world HTML while loading
libxml_use_internal_errors(true);
$doc->loadHTML($page);
libxml_use_internal_errors(false);
$paragraphs = $doc->getElementsByTagName('p');
foreach($paragraphs as $p){
    // Match the paragraph by its exact class attribute
    if($p->getAttribute('class') == 'text-center mrgn-tp-md mrgn-bttm-sm lead') {
        foreach($p->getElementsByTagName('span') as $span) {
            if($span->getAttribute('class') == 'wxo-metric-hide') {
                foreach($span->getElementsByTagName('abbr') as $abbr) {
                    if($abbr->getAttribute('title') == 'Celsius') {
                        // Text of the whole span: the value plus the unit from <abbr>
                        echo trim($span->nodeValue);
                    }
                }
            }
        }
    }
}

Output:

-3°C
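
The same lookup could probably be written more compactly with DOMXPath. This is only a sketch: the `$html` string is a hypothetical sample that mirrors the structure described in the question, not the live page, and `$query` and `$temperature` are illustrative names.

```php
<?php
// Hypothetical sample markup mirroring the structure the question targets.
$html = '<html><body><p class="text-center mrgn-tp-md mrgn-bttm-sm lead">'
      . '<span class="wxo-metric-hide">-3&deg;<abbr title="Celsius">C</abbr></span>'
      . '</p></body></html>';

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // keep parser warnings out of the output
$doc->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors(false);

// One XPath expression replaces the three nested loops:
// a <p> with the exact class, containing a <span class="wxo-metric-hide">
// that itself has an <abbr title="Celsius"> child.
$xpath = new DOMXPath($doc);
$query = '//p[@class="text-center mrgn-tp-md mrgn-bttm-sm lead"]'
       . '/span[@class="wxo-metric-hide"][abbr[@title="Celsius"]]';
$nodes = $xpath->query($query);

// Take the first match, or null when the structure is not found.
$temperature = $nodes->length > 0 ? trim($nodes->item(0)->nodeValue) : null;
echo $temperature, "\n";
```

To run it against the real page, `$html` would be replaced by the result of `file_get_contents()` on the URL, with the same caveat about the markup changing.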

This assumes the page's classes and structure stay consistent.
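
As for where the `["` and `"]` come from: they are added by `json_encode(array($result))`, which serializes a one-element array as a JSON list. A small sketch of the difference, using a hard-coded stand-in for the scraped value (the `$result` below is sample data, and the `JSON_UNESCAPED_UNICODE` flag is an addition here so the degree sign stays readable):

```php
<?php
// Stand-in for the scraped value; the question's code gets this from fetchdata().
$result = "-3°";

// Encoding an array produces a JSON list: square brackets around a quoted string.
$asJsonList = json_encode(array($result), JSON_UNESCAPED_UNICODE);

// Using the string directly gives just the value, with no brackets or quotes.
$plain = $result;

echo $asJsonList, "\n"; // JSON list form
echo $plain, "\n";      // bare value
```

So in the question's code, `echo $result;` instead of `echo json_encode(array($result));` should print the value on its own.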

chris85