
I want to get all hyperlinks on a given site, so I wrote this code, but it's not working properly: it only shows the hyperlinks on the given URL itself. I want to get all hyperlinks of the whole site.

<?php
function getAlllinks($site) {
    $html = file_get_contents($site);
    $dom = new DOMDocument;
    @$dom->loadHTML($html);
    $links = $dom->getElementsByTagName('a');

    foreach ($links as $link) {
        $url = $link->getAttribute('href');
        if ($url[0] != "#" && $url[0] != " ") {
            echo $url . '<br>';
            getAlllinks($url);
        }
    }
}

getAlllinks("http://www.example.com");
?>

for example in http://www.example.com

<html>
<body>
  <a href="index.php">Homepage</a>
  <a href="contact.php">Contact</a>
</body>
</html>

here it should first show the hyperlinks index.php and contact.php, and then show all links found on index.php and contact.php. The contact.php link could also appear as http://www.example.com/contact.php.


1 Answer


I think what you're trying to do is crawl an entire website, gathering all the links. Your code example isn't capable of doing that. What you want to do is load each page, grab every link on that page, then recurse on those links.
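One likely reason your recursion stalls is that the hrefs are relative (e.g. `index.php`), so `file_get_contents` can't fetch them directly; you also never track which pages you've already visited, so a site with cross-links would recurse forever. Here is a minimal sketch of that idea, resolving relative links against the page they were found on and keeping a visited set (the function names `resolveUrl` and `crawl` are just illustrative, not from any library):

```php
<?php
// Resolve a possibly-relative href against the URL of the page it was found on.
function resolveUrl($base, $href) {
    // Already absolute (has a scheme like http:)? Use it as-is.
    if (parse_url($href, PHP_URL_SCHEME) !== null) {
        return $href;
    }
    $parts = parse_url($base);
    $root  = $parts['scheme'] . '://' . $parts['host'];
    if ($href !== '' && $href[0] === '/') {
        return $root . $href;                 // root-relative: /contact.php
    }
    // Path-relative: replace everything after the last slash of the base path.
    $path = isset($parts['path']) ? $parts['path'] : '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $root . $dir . $href;              // index.php -> http://host/index.php
}

function crawl($url, &$visited) {
    if (isset($visited[$url])) return;        // don't fetch the same page twice
    $visited[$url] = true;

    $html = @file_get_contents($url);
    if ($html === false) return;

    $dom = new DOMDocument;
    @$dom->loadHTML($html);
    foreach ($dom->getElementsByTagName('a') as $a) {
        $href = $a->getAttribute('href');
        if ($href === '' || $href[0] === '#') continue;
        $next = resolveUrl($url, $href);
        echo $next . "<br>\n";
        crawl($next, $visited);
    }
}

// Usage (fetches pages over the network):
// $visited = [];
// crawl("http://www.example.com", $visited);
```

Note this sketch will also follow links to other hosts; a real crawler would compare the host of `$next` against the starting host before recursing.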

Check out these links for more info:

How do I make a simple crawler in PHP?

https://en.wikipedia.org/wiki/Web_crawler

http://phpcrawl.cuab.de/example.html
