2

Using XPath, how can I get the value of the href attribute in the following case (grabbing only the URL of the right link)?

<a href="http://foo.com">a wrong one</a>
<a href="http://example.com">the right one</a>
<a href="http://boo.com">a wrong one</a>

That is, I want the value of the href attribute of the link that has a particular text.

AlenD

5 Answers

4

This will select the attributes:

"//a[text()='the right one']/@href"
Mihai Toader
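
As a rough sketch of how the expression `//a[text()='the right one']/@href` could be evaluated from PHP, the snippet below uses the built-in DOMDocument and DOMXPath classes. The wrapping `<html><body>` element and the variable names are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// Sample links from the question, wrapped in a root element
// (an assumption, so the markup parses as one document).
$html = '<html><body>'
      . '<a href="http://foo.com">a wrong one</a>'
      . '<a href="http://example.com">the right one</a>'
      . '<a href="http://boo.com">a wrong one</a>'
      . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);

// query() returns a DOMNodeList; here it holds the matching href attribute nodes.
$attrs = $xpath->query("//a[text()='the right one']/@href");

// Take the first match, or null if no link has that text.
$href = $attrs->length > 0 ? $attrs->item(0)->nodeValue : null;

echo $href, "\n"; // prints http://example.com
```

If only the string is needed, `$xpath->evaluate("string(//a[text()='the right one']/@href)")` should return the same value directly.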
1

I think this is the best solution. It collects every href value into an array, so you can use each of them as an array element.

$string = '
<a href="http://foo.com">a wrong one</a>
<a href="http://example.com">the right one</a>
<a href="http://boo.com">a wrong one</a>
';

$array = get_all_string_between($string, 'href="', '">');
print_r($array); // just to see what is inside the array

// now get each of them
foreach ($array as $value) {
    echo $value . '<br>';
}

// Returns every substring of $string found between $start and $end.
function get_all_string_between($string, $start, $end)
{
    $result = array();
    $offset = 0;
    while (true) {
        $ini = strpos($string, $start, $offset);
        if ($ini === false) {
            break; // no further start delimiter
        }
        $ini += strlen($start);
        $stop = strpos($string, $end, $ini);
        if ($stop === false) {
            break; // start delimiter without a matching end
        }
        $result[] = substr($string, $ini, $stop - $ini);
        $offset = $stop + strlen($end);
    }
    return $result;
}
Ryo
0
"//a[@href='http://example.com']"
scragz
0

I'd use an open-source class like simple_html_dom.php:

// $sBody holds the HTML source to parse
$oHtml = new simple_html_dom();
$oHtml->load($sBody);
foreach ($oHtml->find('a') as $oElement) {
    echo $oElement->href;
}
Simon
0

Here's a full example using SimpleXML:

$xml = '<html><a href="http://foo.com">a wrong one</a>'
        . '<a href="http://example.com">the right one</a>'
        . '<a href="http://boo.com">a wrong one</a></html>';
$tree = simplexml_load_string($xml);
$nodes = $tree->xpath('//a[text()="the right one"]');
$href = (string) $nodes[0]['href'];
scoffey
  • Use [.="the right one"] in preference to [text()="the right one"]: it's shorter, and there might be comments in the value, which would split it into multiple text nodes. – Michael Kay Jan 19 '11 at 10:14
  • But this selects the `a` elements, not `@href`. –  Jan 22 '11 at 19:15
  • True for the XPath query. But check the last assignment. – scoffey Jan 23 '11 at 04:49
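
To illustrate the comment about `[.="the right one"]` versus `[text()="the right one"]`, here is a small sketch using SimpleXML. The markup is hypothetical: a comment has been placed inside the link text so that the `a` element contains two separate text nodes.

```php
<?php
// Hypothetical markup: the comment splits the link text into
// two text nodes, "the " and "right one".
$xml = '<html><a href="http://example.com">the <!-- note -->right one</a></html>';
$tree = simplexml_load_string($xml);

// text() compares each text node on its own, so neither one matches.
$byText = $tree->xpath('//a[text()="the right one"]');

// "." compares the element's full string value, "the right one".
$byDot = $tree->xpath('//a[.="the right one"]');

echo count($byText), "\n"; // 0
echo count($byDot), "\n";  // 1

// The href is then read from the matched element, as in the answer above.
$href = (string) $byDot[0]['href'];
echo $href, "\n"; // http://example.com
```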