I want to get tags from multiple pages (not a single page!) at once, store them by element, and then print them.

I was able to parse the XML file and get all the links from it, but adding each link separately looks like too much work.

I am trying to collect the TITLE, IMG, and PRICE of each item from a certain website, using its XML file, which already has all the links.

All help will be appreciated.

I use this code to get all the links:

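<?php
$urls = array();
$DomDocument = new DOMDocument();
$DomDocument->preserveWhiteSpace = false;
// Load the sitemap; without the scheme (https://) PHP would look for a local file.
$DomDocument->load('https://www.ivory.co.il/sitemap.xml');
$DomNodeList = $DomDocument->getElementsByTagName('loc');

// Collect the text of every <loc> element.
foreach ($DomNodeList as $url) {
    $urls[] = $url->nodeValue;
}
?>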
Emil
Itay Joseph
  • Please add more information to your question! What have you tried? What does the XML structure look like? I'm not sure I understand what you want to do. – Emil Aug 13 '15 at 16:42
  • How do I get a particular element from a complete website (not a single page)? – Itay Joseph Aug 13 '15 at 17:04
  • And you have an XML file with all the links you want to scrape? – Emil Aug 13 '15 at 17:07
  • Yes, I have the XML file! I used this code to get all the links: $DomDocument->preserveWhiteSpace = false; $DomDocument->load('https://www.ivory.co.il/sitemap.xml'); $DomNodeList = $DomDocument->getElementsByTagName('loc'); foreach ($DomNodeList as $url) { $urls[] = $url->nodeValue; } print_r($urls); – Itay Joseph Aug 13 '15 at 17:08

1 Answer

I'm not really sure what you want to do, but I'll try to help you. The solution is fairly simple; just follow these steps:

  1. Download PHP Simple HTML DOM Parser from SourceForge.
  2. Loop through your URL array with a foreach loop.
  3. Find the content you want to scrape on each URL.

It's hard to be more specific because I don't know what content you want to get from each URL. But take a look at my example, which scrapes data from two different questions on SO.

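<?php
include 'simple_html_dom.php';

// Pages to scrape; replace this with the $urls array built from the XML file.
$list   = array("http://stackoverflow.com/questions/31993435/how-i-can-use-simple-html-dom-php-to-use-all-the-links-inside-xml-and-get-the-ta",
                "http://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php?rq=1");

$result = array();

foreach($list as $url) {

    $html   = file_get_html($url);
    if (!$html) {
        continue; // skip pages that could not be downloaded or parsed
    }

    foreach($html->find('#content') as $content) {

        // find($selector, 0) returns the first match, or null if nothing matches
        $title  = $content->find('h1', 0);
        $vote   = $content->find('span.vote-count-post', 0);

        $row            = array();
        $row['url']     = $url;
        $row['title']   = $title ? $title->plaintext : null;
        $row['vote']    = $vote ? $vote->plaintext : null;

        $result[]       = $row;
    }

    $html->clear(); // free the memory held by the parsed page
}

?>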

<pre>
<?php print_r($result); ?>
</pre>

OUTPUT:

Array
(
    [0] => Array
        (
            [url] => http://stackoverflow.com/questions/31993435/how-i-can-use-simple-html-dom-php-to-use-all-the-links-inside-xml-and-get-the-ta
            [title] => How i can use simple html dom php to use all the links inside xml and get the tags I want from them
            [vote] => -2 
        )

    [1] => Array
        (
            [url] => http://stackoverflow.com/questions/3577641/how-do-you-parse-and-process-html-xml-in-php?rq=1
            [title] => How do you parse and process HTML/XML in PHP?
            [vote] => 1186 
        )

)
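
To connect this to the sitemap from the question, here is a minimal sketch that uses only PHP's built-in DOMDocument and DOMXPath instead of Simple HTML DOM. The function names (sitemap_urls, extract_fields, scrape_all) are hypothetical, and so are the XPath queries in the usage note: I don't know the markup of the target pages, so the queries for title, img and price have to be adapted after inspecting a real page.

```php
<?php
// Sketch only: sitemap XML -> list of URLs -> one row of fields per page.

/** Return the trimmed text of every <loc> element in a sitemap XML string. */
function sitemap_urls(string $xml): array
{
    $doc = new DOMDocument();
    $doc->preserveWhiteSpace = false;
    if (!@$doc->loadXML($xml)) {
        return array();               // malformed XML: nothing to scrape
    }
    $urls = array();
    foreach ($doc->getElementsByTagName('loc') as $loc) {
        $urls[] = trim($loc->nodeValue);
    }
    return $urls;
}

/**
 * Run each XPath query against one HTML page.
 * Returns field name => text of the first match, or null when nothing matches.
 */
function extract_fields(string $html, array $queries): array
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html);           // @ hides warnings about sloppy real-world markup
    $xpath = new DOMXPath($doc);

    $row = array();
    foreach ($queries as $name => $query) {
        $nodes = $xpath->query($query);
        $row[$name] = ($nodes !== false && $nodes->length > 0)
            ? trim($nodes->item(0)->nodeValue)
            : null;
    }
    return $row;
}

/**
 * Fetch every URL with $fetch (any callable that returns the HTML string,
 * or false on failure) and collect one row per page that could be fetched.
 */
function scrape_all(array $urls, array $queries, callable $fetch): array
{
    $result = array();
    foreach ($urls as $url) {
        $html = $fetch($url);
        if (!is_string($html) || $html === '') {
            continue;                 // skip pages that could not be downloaded
        }
        $result[] = array('url' => $url) + extract_fields($html, $queries);
    }
    return $result;
}
```

With placeholder queries such as array('title' => '//h1', 'img' => '//img/@src', 'price' => '//*[@class="price"]'), this could be called as scrape_all(sitemap_urls(file_get_contents('https://www.ivory.co.il/sitemap.xml')), $queries, 'file_get_contents'), followed by print_r() on the returned array. Passing the fetcher in as a callable is just a design choice that lets you try the extraction on saved HTML before hitting the live site.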
Emil