I am trying to fetch page content from eBay and search it for item titles. I am using the PHP Simple HTML DOM Parser and its file_get_html function. However, when I try to print the result, my script freezes. First I build the search URLs using data from a CSV file, then I open the first result from the search; when I try to get the URL content, the function fails. The CSV file contains MPN values (shown in a screenshot in the original post).

Here is my code:

$itemsUrl = readCSV(realpath(dirname(__FILE__)) . DS . 'JeepToysEbayIsr.csv');

foreach ($itemsUrl as $itemNumber => $itemUrl) {

    print_r($itemNumber . "\n");
    //$url = "https://www.ebay.co.uk/sch/i.html?_from=R40&_trksid=p2380057.m570.l1313.TR0.TRC0.H0.X3342827.TRS0&_sacat=0&_nkw=".$itemUrl['MPN'];
    //print_r($item);

    //$data = get_web_page($url,"\n");

    include_once("simple_html_dom.php");

    $context = stream_context_create(array(
        'http' => array(
            'header' => array('User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; rv:2.2) Gecko/20110201'),
        ),
    ));

    $url[] = ("https://www.ebay.co.uk/sch/i.html?_from=R40&_trksid=p2380057.m570.l1313.TR0.TRC0.H0.X3342827.TRS0&_sacat=0&_nkw=" . $itemUrl['MPN']);

    foreach ($url as $value) {
        preg_match('/https?\:\/\/[^\" ]+/i', $value, $match);
    }

    if (isset($match[0])) {
        $data = file_get_html($match[0], "\n");
        print_r($data);
    }

}

1 Answer

You could try fetching the URL with cURL or loadHTMLFile, and then use XPath to extract the content you need, for example:

$doc = new DOMDocument();
$doc->loadHTMLFile('https://www.myurl.com', LIBXML_NOERROR | LIBXML_NOWARNING);
$xpath = new DOMXPath($doc);
$var = $xpath->query('//div[contains(@class,"theClass")]');

and then:

print_r($var->item(0));
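
The cURL route mentioned above could look roughly like the following. This is only a sketch under stated assumptions: `fetchHtml` and `queryText` are hypothetical helper names (they are not part of simple_html_dom or of PHP itself), the timeout values are arbitrary, and the class name in the query is the same placeholder as in the snippet above. Explicit timeouts mean a slow response makes the request fail instead of leaving the script hanging.

```php
<?php
// Hypothetical helper: fetch a page body with cURL; returns null on failure.
// The timeout values below are illustrative assumptions, not recommendations.
function fetchHtml(string $url, int $timeoutSeconds = 15): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of echoing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_CONNECTTIMEOUT => 5,     // give up if no connection within 5 s
        CURLOPT_TIMEOUT        => $timeoutSeconds,
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (Windows; U; Windows NT 6.1; rv:2.2) Gecko/20110201',
    ]);
    $body = curl_exec($ch);

    return is_string($body) ? $body : null;
}

// Hypothetical helper: trimmed text of every node matching an XPath query.
function queryText(string $html, string $query): array
{
    if ($html === '') {
        return [];
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query($query);
    if ($nodes === false) {
        return []; // malformed XPath expression
    }

    $texts = [];
    foreach ($nodes as $node) {
        $texts[] = trim($node->textContent);
    }

    return $texts;
}

// Usage (needs network access, so it is left commented out):
// $html = fetchHtml('https://www.myurl.com');
// if ($html !== null) {
//     print_r(queryText($html, '//div[contains(@class,"theClass")]'));
// }
```

Printing the extracted strings, rather than a whole parsed document object, also keeps the output small.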
Florian Richard
  • However, `loadHTMLFile` is not part of the `simple_html_dom` file. – Ico Vladimirov Nov 06 '19 at 13:52
  • I also tried the `get_web_page` function, but parsing the result was difficult. – Ico Vladimirov Nov 06 '19 at 13:54
  • Yes, that's true, but with `loadHTMLFile` you can also save the HTML in a variable, `$myHtml = $doc->saveHTML();`, and maybe pass it to `file_get_html`? If that doesn't work, try cURL, or drop `simple_html_dom` as in my example. – Florian Richard Nov 06 '19 at 14:03
  • `if (isset($match[0])) { $data = get_web_page($match[0], "\n"); //print_r($data); $page = $data['content']; print_r($page); }` This is how I edited it, and the HTML page is OK now. – Ico Vladimirov Nov 06 '19 at 14:05
  • Now the challenge is how to get titles, images, etc. from each link's content :D – Ico Vladimirov Nov 06 '19 at 14:11
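
For the last point (pulling titles and images out of the fetched HTML), one possible approach is a relative XPath query per item. The sketch below is only an illustration: `extractItems` is a hypothetical helper, and the markup it assumes (an item container holding an `h3` title and an `img`) is made up for the example. eBay's real markup is not shown in this thread, so the queries would have to be adapted after inspecting the actual page source.

```php
<?php
// Hypothetical helper: for every node matching $itemQuery, collect the text
// of its first <h3> and the src of its first <img>. The h3/img structure is
// an assumption for illustration, not eBay's actual markup.
function extractItems(string $html, string $itemQuery): array
{
    if ($html === '') {
        return [];
    }

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // tolerate invalid HTML
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query($itemQuery);
    if ($nodes === false) {
        return []; // malformed XPath expression
    }

    $items = [];
    foreach ($nodes as $node) {
        // A leading "." makes the query relative to the current item node.
        $title = $xpath->query('.//h3', $node)->item(0);
        $image = $xpath->query('.//img/@src', $node)->item(0);

        $items[] = [
            'title' => $title ? trim($title->textContent) : null,
            'image' => $image ? $image->nodeValue : null,
        ];
    }

    return $items;
}

// Usage with the page content obtained earlier, e.g. $page = $data['content']:
// print_r(extractItems($page, '//li[contains(@class,"item")]'));
```

Returning plain arrays of strings keeps `print_r` output short and makes the results easy to write back to a CSV file.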