I want to extract URLs from this text:
<body>
<a href="http://domaine.com/t/text/text"> <img src="http://domaine.com/i/text/text"></a> <br>
<a href="http://domaine.com/text"></a> <br>
<a href="http://domaine.com"></a> <br>
<a href="http://domaine.com/text/text"></a> <br>
<a href="http://[GoTo]"></a> <br>
<a href="http://[NextURL]"></a> <br>
</body>
However, I want to exclude URLs that match certain patterns from the extraction. Those patterns are:
http://***/i/***/***
http://***/t/***/***
http://[GoTo]
http://[NextURL]
This means I should get only these URLs as a result:
http://domaine.com/text
http://domaine.com
http://domaine.com/text/text
What I have so far is this regex:
// $string holds the HTML shown above
$regex = '/https?:\/\/[^" ]+/i';
preg_match_all($regex, $string, $matches);
print_r($matches[0]);
As you can see, this extracts all of the URLs, and I don't know how to exclude the ones that match my specific patterns.
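One direction that might work is to keep the extraction regex as it is and filter the matches afterwards against a list of exclusion patterns. The sketch below is only an illustration under some assumptions: each `***` in the patterns is read as exactly one path segment (or the host), and the `$exclude` array and its two regexes are hypothetical names and expressions written for this sample, not a tested general solution.

```php
<?php
// Sample input copied from the question.
$string = <<<'HTML'
<body>
<a href="http://domaine.com/t/text/text"> <img src="http://domaine.com/i/text/text"></a> <br>
<a href="http://domaine.com/text"></a> <br>
<a href="http://domaine.com"></a> <br>
<a href="http://domaine.com/text/text"></a> <br>
<a href="http://[GoTo]"></a> <br>
<a href="http://[NextURL]"></a> <br>
</body>
HTML;

// Step 1: extract every URL, same idea as the original regex.
preg_match_all('/https?:\/\/[^" ]+/i', $string, $matches);

// Step 2: hypothetical exclusion patterns (assumption: "***" means
// one host or one path segment without slashes).
$exclude = [
    '~^https?://[^/]+/[it]/[^/]+/[^/]+$~i',  // http://***/i/***/*** and http://***/t/***/***
    '~^https?://\[(?:GoTo|NextURL)\]$~i',    // http://[GoTo] and http://[NextURL]
];

// Step 3: keep only the URLs that match none of the exclusion patterns.
$urls = array_values(array_filter($matches[0], function ($url) use ($exclude) {
    foreach ($exclude as $pattern) {
        if (preg_match($pattern, $url)) {
            return false; // URL matches an excluded pattern, drop it
        }
    }
    return true;
}));

print_r($urls);
```

With the sample text above, this leaves http://domaine.com/text, http://domaine.com and http://domaine.com/text/text. Filtering in a second pass keeps each exclusion rule readable on its own; folding everything into a single regex with negative lookaheads would also be possible, but is harder to extend.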