1

Possible Duplicate:
php regex to get string inside href tag

I have a text file that contains multiple occurrences of the href attribute.

I want to extract the content of each

href='...'
and print it to the screen.

How can I achieve that? The main problem is writing a correct regex.

Community
Giorgio
  • 1
    Sorry, I've just noticed there's an identical question out there: http://stackoverflow.com/questions/4001328/php-regex-to-get-string-inside-href-tag?rq=1 – Giorgio Sep 04 '12 at 11:43
  • It just doesn't end, does it? [Die, Cthulu, Die](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) _Parse the HTML_ – Elias Van Ootegem Sep 04 '12 at 11:46
  • 4
    I've voted to close my own question :) – Giorgio Sep 04 '12 at 11:47
  • 1
    @EliasVanOotegem grabbing hrefs in a given HTML file isn't really "parsing HTML", and if you know what the HTML is going to look like (for example a flat list of links), it might be better to use regexes than a full-blown parser. – Quentin Pradet Sep 04 '12 at 11:49
  • @Cygal: in rare cases you could be right, but regexes will only take you so far; parsing the string is just so much more reliable – Elias Van Ootegem Sep 04 '12 at 11:52

2 Answers

4

Here you go:

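$pageData = file_get_contents('your.txt');
$all_hrefs = array(); // stays empty if no link is found
// Capture the href value of each <a> tag, single- or double-quoted
if (preg_match_all('/<a\s+href=["\']([^"\']+)["\']/i', $pageData, $links, PREG_PATTERN_ORDER)) {
    $all_hrefs = array_unique($links[1]);
}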

Now you have all unique hrefs in $all_hrefs.

If you want to display them:

foreach ($all_hrefs as $href)
{
    echo $href, PHP_EOL;
}
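
For markup that is less regular than a flat list of links, the comments on the question suggest parsing the HTML instead. A minimal sketch of that alternative, assuming PHP's DOM extension is available; the sample string below is made up for illustration and stands in for the contents of your.txt:

```php
<?php
// Hypothetical sample input; in practice this would come from file_get_contents('your.txt').
$pageData = '<p><a href="one.html">One</a> '
          . '<a class="x" href=\'two.html\'>Two</a> '
          . '<a href="one.html">Again</a></p>';

// Collect libxml warnings internally instead of printing them for sloppy markup.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($pageData);
libxml_clear_errors();

// Read the href attribute of every <a> element that has one.
$all_hrefs = array();
foreach ($doc->getElementsByTagName('a') as $anchor) {
    if ($anchor->hasAttribute('href')) {
        $all_hrefs[] = $anchor->getAttribute('href');
    }
}

// Drop duplicates and reindex the array from 0.
$all_hrefs = array_values(array_unique($all_hrefs));

foreach ($all_hrefs as $href) {
    echo $href, PHP_EOL;
}
```

Unlike the regex, this does not depend on href being the first attribute or on a particular quoting style.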
Florian Bauer
-1
// $file_content holds the text read from the file
preg_match_all('|<a href="([^"]+)"|', $file_content, $matches);
print_r($matches);

Not tested, but should work
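
A pattern of this kind is sensitive to how the capture group is written. When one line holds several links, a greedy `.+` can run on to the last closing quote, while a negated character class stops at the first. A small sketch of the difference, using a made-up sample string (`$html`, `$greedy` and `$strict` are names chosen for illustration):

```php
<?php
// Hypothetical input with two links on the same line.
$html = '<a href="a.html">A</a> <a href="b.html">B</a>';

// Greedy: .+ extends to the last "> on the line, giving a single oversized match.
preg_match_all('|<a href="(.+)">|', $html, $greedy);

// Negated class: [^"]+ cannot cross a double quote, so each href is captured separately.
preg_match_all('|<a href="([^"]+)"|', $html, $strict);

print_r($greedy[1]);
print_r($strict[1]);
```

Here `$greedy[1]` holds one string spanning both links, whereas `$strict[1]` holds `a.html` and `b.html` as separate entries.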

Jakub Truneček