1

I want to use PHP to extract the href value (an image URL) from an anchor link. See below for an illustration:

Start:

<a href="http://someimagelink.com/image34.jpg">image34.jpg</a>

Goal:

http://someimagelink.com/image34.jpg

More specifically, how do I strip out the <a href=" and ">image34.jpg</a> so it works with any image that is given?

Andy Lester
JCHASE11

4 Answers

2

You should look into PHP's DOM parser, specifically DOMDocument::loadHTML: http://php.net/manual/en/domdocument.loadhtml.php

poplitea
2

You could also use Simple HTML DOM Parser like this:

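// Requires the third-party Simple HTML DOM library
include 'simple_html_dom.php';

$html = file_get_html('http://www.google.com/');

// Find all images and print each src
foreach ($html->find('img') as $element) {
    echo $element->src . '<br>';
}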
Paul Dessert
  • How do I refer to the current page instead of specifying an absolute URL? I am referring to the $html variable. – JCHASE11 Dec 06 '12 at 00:35
0

What about doing something like this? I wasn't sure if this was a good solution or not...thoughts?

$html = "<a href='http://www.imagelink.com/images33.jpg'>images33.jpg</a>";
// Capture the href value, whether it is quoted with ' or " or left unquoted
preg_match_all('/href=[\'"]?([^\s\>\'"]*)[\'"\>]/', $html, $matches);
// Use the first match, or an empty string if no href was found
$url = !empty($matches[1]) ? $matches[1][0] : '';
echo "<a href='$url'><img src='$url' width='100'></a>";
JCHASE11
  • You shouldn't use regex, since it's generally a bad idea to parse HTML or XML (which aren't regular languages) with regex. Read more here: http://stackoverflow.com/questions/701166/can-you-provide-some-examples-of-why-it-is-hard-to-parse-xml-and-html-with-a-reg – poplitea Dec 05 '12 at 21:22
  • Yes, it is definitely a bad idea to parse HTML content with regex, but it works and is so easy... BUT, I understand this shouldn't be done. I will try to use a DOM parser. – JCHASE11 Dec 06 '12 at 00:31
0

You can use PHP's DOMDocument to get what you need. The code would look something like this:

// loadHTML() is an instance method, so create the document first
$dom_doc = new DOMDocument();
$dom_doc->loadHTML($html_string);
$a_tag_node_list = $dom_doc->getElementsByTagName('a');
foreach ($a_tag_node_list as $a_tag_node) {
    // Print the href attribute of each anchor
    echo $a_tag_node->getAttribute('href');
}
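
For reference, here is a self-contained sketch of the DOMDocument approach applied to the sample link from the question. The helper name extract_hrefs is hypothetical, and suppressing libxml warnings is an assumption that the input may not be perfectly valid HTML; treat it as a starting point rather than a definitive implementation.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the href
// of every <a> element found in a non-empty HTML fragment.
function extract_hrefs(string $html): array
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of printing them,
    // since real-world markup is often not strictly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        // Skip anchors that have no href attribute at all
        if ($anchor->hasAttribute('href')) {
            $hrefs[] = $anchor->getAttribute('href');
        }
    }
    return $hrefs;
}

// Sample markup taken from the question
$html = '<a href="http://someimagelink.com/image34.jpg">image34.jpg</a>';
$hrefs = extract_hrefs($html);
echo $hrefs[0], PHP_EOL; // prints http://someimagelink.com/image34.jpg
```

Because the function returns an array, the same call handles markup that contains several links; pick the entry you need or loop over all of them.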
Mike Brant