How would I extract an anchor's href from a given piece of HTML, based on the src of the image it wraps?

Example:

<a href="http://idontneedthis.com"><img src="path/to/image/1.gif" /></a>
<a href="http://iwantthis.com"><img src="path/to/image/2.gif" /></a>
<a href="http://idontneedthisagain.com"><img src="path/to/image/3.gif" /></a>

In this case I need the link around the image whose src is 2.gif, i.e. the anchor whose href is http://iwantthis.com

RhymeGuy

2 Answers

Here is a way you can utilize DOM and XPath to extract those @href values.

$doc = new DOMDocument();
// loadHTML() is an instance method; calling it statically fails on PHP 8.
$doc->loadHTML('
    <a href="http://idontneedthis.com"><img src="path/to/image/1.gif" /></a>
    <a href="http://iwantthis.com"><img src="path/to/image/2.gif" /></a>
    <a href="http://idontneedthisagain.com"><img src="path/to/image/3.gif" /></a>
');

$xpath = new DOMXPath($doc);
// Select every <a> that has a child <img> whose src contains "2.gif".
$links = $xpath->query('//a[img[contains(@src, "2.gif")]]');

foreach ($links as $link) {
    echo $link->getAttribute('href');
}

Output

http://iwantthis.com
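
Note that contains(@src, "2.gif") also matches a src such as path/to/image/12.gif. If an exact match is wanted, one option is to compare the attribute in PHP instead of in the XPath expression. The following is only a sketch under that assumption: the function name findHrefsByImageSrc is hypothetical (it is not part of the DOM extension), and it assumes the <img> is a direct child of the <a>, as in the question's markup.

```php
<?php
// Hypothetical helper: returns the href of every <a> that directly wraps
// an <img> whose src attribute equals $src exactly.
function findHrefsByImageSrc(string $html, string $src): array
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of emitting them,
    // since real-world HTML is often not perfectly well-formed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $parent = $img->parentNode;
        if ($img->getAttribute('src') === $src
            && $parent instanceof DOMElement
            && strtolower($parent->tagName) === 'a'
            && $parent->hasAttribute('href')
        ) {
            $hrefs[] = $parent->getAttribute('href');
        }
    }

    return $hrefs;
}
```

Calling findHrefsByImageSrc($html, 'path/to/image/2.gif') on the markup from the question would return an array holding the single href of the anchor around 2.gif, and an empty array when no image has that src.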
hwnd

Using a regex to solve this kind of problem is a bad idea and will likely lead to unmaintainable and unreliable code. Better to use an HTML parser.

If you still want to use a regex, you can try:

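// Tie the pattern to a single <a><img> pair so it cannot start at an earlier href.
preg_match_all('%<a\s[^>]*href="([^"]*)"[^>]*>\s*<img\s[^>]*src="path/to/image/2\.gif"%i', $html, $match, PREG_PATTERN_ORDER);
$href = $match[1][0] ?? null; // null when nothing matched
echo $href;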

Output:

http://iwantthis.com
Pedro Lobito