
Hello, I wrote the following code to extract the name and price from a table with XPath and cURL.

   <?php
    include_once ("xpath.php");
    header('Content-type: text/html; charset=UTF-8');
    $ch = curl_init ("http://emalls.ir/%D9%84%DB%8C%D8%B3%D8%AA-%D9%82%DB%8C%D9%85%D8%AA~Category~39~Search~Nokia");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    //$page = curl_exec($ch);
    $page = utf8_decode(curl_exec($ch));

    $dom = new DOMDocument();
    libxml_use_internal_errors(true);
    $dom->loadHTML($page);
    libxml_clear_errors();
    $xpath = new DOMXpath($dom);
    $data = array();


    // get all table rows and rows which are not headers
    $produstname = $xpath->query('//table/tbody/tr/td/a/text()');
    $produstprice = $xpath->query('//table/tbody/tr/td[8]/text()');
    $data = array();
    for ($x=0; $x<=1; $x++){
        $data[$x]['title'] = $produstname->item($x)->nodeValue;
        $data[$x]['price'] = $produstprice->item($x)->nodeValue;
    }
    ?>

The following two XPath expressions work in Chrome to get the name and price:

 name: $x("//table/tbody/tr/td/a/text()")
 price: $x("//table/tbody/tr/td[5]/text()")

but when I use them in the PHP code, I get this error:

 : Trying to get property of non-object in 
Kevin
amir rasabeh

1 Answer


I've looked at the site, and I suggest targeting the table's id="" attribute instead. You can also use foreach. Example:

$ch = curl_init ("http://emalls.ir/%D9%84%DB%8C%D8%B3%D8%AA-%D9%82%DB%8C%D9%85%D8%AA~Category~39~Search~Nokia");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch,CURLOPT_USERAGENT,'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
$page = curl_exec($ch); // fetch once; calling curl_exec() again would repeat the request
$page = utf8_decode($page);

$dom = new DOMDocument('1.0', 'utf-8');
libxml_use_internal_errors(true);
$dom->loadHTML($page);
libxml_clear_errors();
$xpath = new DOMXpath($dom);

$data = array();
$table_rows = $xpath->query('//table[@id="grdprice"]/tr'); // target the row (the browser rendered <tbody>, but actually it really doesnt have one)

if($table_rows->length <= 0) { // exit if not found
    echo 'no table rows found';
    exit;
}

foreach($table_rows as $tr) { // foreach row
    $row = $tr->childNodes;
    if($row->item(0)->tagName != 'th') { // avoid headers
        $data[] = array(
            'name' => trim($row->item(0)->nodeValue),
            'price' => trim($row->item(7)->nodeValue),
        );
    }
}

echo '<pre>';
print_r($data);
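
To see why the original `//table/tbody/tr/...` paths fail in PHP while working in Chrome, here is a minimal sketch on a small inline table. The markup, product names, and prices are made up for illustration and only stand in for the real page; the point is that the browser adds `<tbody>` when rendering, whereas `DOMDocument` keeps the markup as served, so `item($x)` returns `null` and reading `->nodeValue` on it raises "Trying to get property of non-object".

```php
<?php
// Hypothetical markup standing in for the real page: note there is no <tbody>.
$html = '<table id="grdprice">'
      . '<tr><th>Name</th><th>Price</th></tr>'
      . '<tr><td><a href="#">Nokia 105</a></td><td>1,000</td></tr>'
      . '<tr><td><a href="#">Nokia 3310</a></td><td>2,500</td></tr>'
      . '</table>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
$dom->loadHTML($html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

// Browser-style path: matches nothing here, because no <tbody> exists in the source.
$withTbody = $xpath->query('//table/tbody/tr');

// Path without <tbody>: matches the header row and both data rows.
$rows = $xpath->query('//table[@id="grdprice"]/tr');

$data = array();
foreach ($rows as $tr) {
    // Query only <td> children, so whitespace text nodes are never touched.
    $cells = $xpath->query('td', $tr);
    if ($cells->length === 0) {
        continue; // header row: it has <th> cells only
    }
    $data[] = array(
        'name'  => trim($cells->item(0)->nodeValue),
        'price' => trim($cells->item(1)->nodeValue),
    );
}

echo $withTbody->length, ' rows with tbody, ', $rows->length, " rows without\n";
print_r($data);
```

Checking `->length` before calling `->item()`, as the answer does for `$table_rows`, is a cheap way to catch a path that matches nothing.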


Kevin
  • Thanks. You're a PHP professional. Why is $data[] empty? – amir rasabeh Sep 05 '14 at 06:09
  • @amirrasabeh thanks for the compliment, but no, I haven't even scratched the surface of PHP; still a long way to go – Kevin Sep 05 '14 at 06:11
  • @amirrasabeh the empty brackets simply mean that if you do not explicitly put an index on the assignment, the value is appended to the end of the array – Kevin Sep 05 '14 at 06:12
  • Do you have an email address? I'd like to stay in touch with you – amir rasabeh Sep 05 '14 at 06:16
  • @amirrasabeh no need to email me; if I can help within the scope of SO and I see your question, I'll try to answer if I know – Kevin Sep 05 '14 at 06:19
  • Thanks, @Ghost. What does trim do in trim($row-> – amir rasabeh Sep 05 '14 at 06:23
  • @amirrasabeh it just removes extra whitespace around the text; try removing it and check what happens – Kevin Sep 05 '14 at 06:24
  • I used the script you wrote above and changed it ('price' => trim($row->item(2)->nodeValue),) to read another table at this URL: http://wpu.ir/65pqe. But it doesn't handle UTF-8; it shows characters like "???? ???? ???? ??? ???" – amir rasabeh Sep 05 '14 at 06:30
  • @amirrasabeh try adding this: `'name' => iconv('UTF-8', 'ISO-8859-1', trim($row->item(0)->nodeValue)),` that will solve your issue – Kevin Sep 05 '14 at 06:44
  • Thanks. How can I exclude tr tags that have a class in a script like the one above? – amir rasabeh Sep 05 '14 at 06:59
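
The last comments raise two follow-ups: Persian text turning into question marks, and skipping rows that carry a class. Below is a rough sketch of one way to handle both, not the method from the answer. The markup and the class name `ad` are made up for illustration. It assumes the page is UTF-8 and swaps the `utf8_decode()` step for an encoding hint prepended to the HTML, since `utf8_decode()` converts to ISO-8859-1, which cannot represent Persian letters and replaces them with `?`.

```php
<?php
// Hypothetical markup: one header row, one row with a class, two plain rows.
$html = '<table id="grdprice">'
      . '<tr><th>Name</th><th>Price</th></tr>'
      . '<tr><td>نوکیا</td><td>1,000</td></tr>'
      . '<tr class="ad"><td>Sponsored</td><td>-</td></tr>'
      . '<tr><td>Nokia 3310</td><td>2,500</td></tr>'
      . '</table>';

$dom = new DOMDocument();
libxml_use_internal_errors(true);
// The prepended declaration tells libxml to read the markup as UTF-8
// (assumption: the fetched page really is UTF-8 encoded).
$dom->loadHTML('<?xml encoding="UTF-8">' . $html);
libxml_clear_errors();
$xpath = new DOMXPath($dom);

// tr[not(@class)] drops rows that have a class attribute;
// [td] keeps only rows with at least one <td>, which also skips the header.
$rows = $xpath->query('//table[@id="grdprice"]/tr[not(@class)][td]');

$data = array();
foreach ($rows as $tr) {
    $cells = $xpath->query('td', $tr);
    $data[] = array(
        'name'  => trim($cells->item(0)->nodeValue), // DOM returns UTF-8 strings
        'price' => trim($cells->item(1)->nodeValue),
    );
}

print_r($data);
```

To drop only rows with one particular class rather than any class, a predicate such as `tr[not(contains(@class, "ad"))]` could be used instead, again with `ad` as a placeholder name.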