
I want to get the links "http://www.w3schools.com/default.asp" and "http://www.google.com" from this webpage. I only want the links from the <a> tags inside <div class="link">; the page contains many other <a> tags, and I don't want those. How can I retrieve only these particular links? Can anyone help me?

<div class="link">
<a href="http://www.w3schools.com/default.asp">
<h4>W3 Schools</h4>
</a>
</div>
<div class="link">
<a href="http://www.google.com">
<h4>Google</h4>
</a>
</div>
sarath
Comment: Check this: http://stackoverflow.com/questions/2720805/php-regular-expression-to-get-a-url-from-a-string – Sean Doe Dec 19 '13 at 11:21

3 Answers


Use a DOM Parser such as DOMDocument to achieve this:

$dom = new DOMDocument;
$dom->loadHTML($html); // $html is a string containing the HTML

foreach ($dom->getElementsByTagName('a') as $link) {
    echo $link->getAttribute('href').'<br/>';
}

Output:

http://www.w3schools.com/default.asp
http://www.google.com
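
Real-world pages are rarely valid HTML, and loadHTML() reports each problem as a PHP warning. A minimal sketch of buffering those messages instead, using the libxml error functions; the sloppy markup in $html is a hypothetical sample, not the page from the question:

```php
<?php
// Hypothetical, deliberately sloppy markup: an unknown tag and an unclosed <div>.
$html = '<div class="link"><a href="http://www.google.com">Google</a><foo></foo>';

// Buffer parser messages instead of emitting them as PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($html);

$problems = libxml_get_errors();        // array of LibXMLError objects, possibly empty
libxml_clear_errors();                  // free the buffered messages
libxml_use_internal_errors($previous);  // restore the earlier setting

// Collect the href of every <a>, exactly as in the loop above.
$hrefs = [];
foreach ($dom->getElementsByTagName('a') as $link) {
    $hrefs[] = $link->getAttribute('href');
}

echo count($problems), " parser message(s)\n";
echo implode("\n", $hrefs), "\n";
```

The parser still builds a usable tree from the broken input, so the link is extracted either way; the buffered messages in $problems can be logged or ignored.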



UPDATE: If you only want the links inside the specific <div>, you can use an XPath expression to find the links inside the div, and then loop through them to get the href attribute:

$dom = new DOMDocument;
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);
$links_inside_div = $xpath->query("//*[contains(@class, 'link')]/a");

foreach ($links_inside_div as $link) {
    echo $link->getAttribute('href').'<br/>';
}
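
Note that contains(@class, 'link') is a substring test, so it would also match classes such as "hyperlinks" or "link-list". If that matters, a stricter variant can compare whole class tokens. This is a sketch under assumptions: the third <div> is a hypothetical extra element added only to show the difference, and the query is limited to <div> elements.

```php
<?php
// Sample markup modeled on the question. The "hyperlinks" div is a hypothetical
// addition that a plain substring match on 'link' would also pick up.
$html = <<<HTML
<div class="link"><a href="http://www.w3schools.com/default.asp"><h4>W3 Schools</h4></a></div>
<div class="link"><a href="http://www.google.com"><h4>Google</h4></a></div>
<div class="hyperlinks"><a href="http://example.com/other"><h4>Other</h4></a></div>
HTML;

libxml_use_internal_errors(true); // keep parser warnings out of the output
$dom = new DOMDocument;
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

// Pad the class attribute with spaces so that ' link ' matches only a whole token.
$query = "//div[contains(concat(' ', normalize-space(@class), ' '), ' link ')]/a";

$hrefs = [];
foreach ($xpath->query($query) as $link) {
    $hrefs[] = $link->getAttribute('href');
}

echo implode("\n", $hrefs), "\n";
```

With this query only the two links inside <div class="link"> are collected; the link in the "hyperlinks" div is skipped.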


Amal Murali
$dom = new DOMDocument;
$dom->loadHTML($html);
foreach ($dom->getElementsByTagName('a') as $node)
{
  echo $node->nodeValue.': '.$node->getAttribute("href")."\n";
}
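
When the link text sits on its own line, as in the question's markup, nodeValue includes the surrounding line breaks. A small sketch that trims the text before printing; the markup here is a simplified sample (plain text instead of the <h4> elements), and $links is just an illustrative variable name:

```php
<?php
// Simplified sample: the link text is on its own line, as in the question.
$html = <<<HTML
<div class="link">
<a href="http://www.w3schools.com/default.asp">
W3 Schools
</a>
</div>
<div class="link">
<a href="http://www.google.com">
Google
</a>
</div>
HTML;

$dom = new DOMDocument;
$dom->loadHTML($html);

$links = [];
foreach ($dom->getElementsByTagName('a') as $node) {
    // textContent holds the same text as nodeValue here; trim() drops the line breaks.
    $links[$node->getAttribute('href')] = trim($node->textContent);
}

foreach ($links as $href => $text) {
    echo $text, ': ', $href, "\n";
}
```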
웃웃웃웃웃

You can use the Snoopy PHP class. Snoopy is a PHP class that simulates a web browser; it automates the task of retrieving web page content and posting forms: http://sourceforge.net/projects/snoopy/

Otherwise, try using jQuery:

 <script src="http://ajax.googleapis.com/ajax/libs/jquery/1.10.2/jquery.min.js"></script>
 <script type="text/javascript">
    $( document ).ready(function() {
         // Visit every <a> inside an element with class "link"
         $( ".link a" ).each(function( index ) {
             var link = $( this ).attr("href");
             alert( link );
         });
    });
</script>

You can also get all the links on the page with plain JavaScript:

 var list = document.getElementsByTagName("a");
Sudheesh.R