I need help with a regex that will find a link that appears in different formats depending on how it was inserted into the HTML page.
I can read the pages into PHP. I just have not been able to write the right regex to find the URLs and isolate them.
I have a few examples of how they get inserted. Sometimes they are plain-text links, and sometimes they have an <a> tag wrapped around them. There is even the odd occasion where text that is not part of the link follows it directly, without any spacing.
The Article ID and Article Key are never the same. The Article Key, however, always ends with a digit. If this is possible, I could really use the help. Thanks.
Here are a few examples.
http://www.example.com/ArticleDetails.aspx?ArticleID=3D10045411&AidKey=3D-2086622941
http://example.com/ArticleDetails.aspx?ArticleID=10919199&AidKey=1956996566
<a href="http://www.example.com/ArticleDetails.aspx?ArticleID=10773616&AidKey=1998267392">http://www.example.com/ArticleDetails.aspx?ArticleID=10773616&AidKey=1998267392</a>
<a href="http://www.example.com/ArticleDetails.aspx?ArticleID=10773616&AidKey=1998267392">This is a link description</a>
http://example.com/ArticleDetails.aspx?ArticleID=10975137&AidKey=701321736this is not part of the url.
In the end, I am just looking for the URL, like this:
http://example.com/ArticleDetails.aspx?ArticleID=10975137&AidKey=701321736
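
One possible pattern for the examples above, offered as a minimal sketch rather than a definitive answer. It assumes the host is always example.com with an optional www., that ArticleID is all digits, and that AidKey is an optional minus sign followed by digits, which is what lets the match stop before any trailing text. The function name extractArticleUrls is hypothetical. Treating the 3D after each = as leftover quoted-printable encoding and stripping it is also an assumption, as is allowing &amp; in place of & inside HTML.

```php
<?php
// Hypothetical helper: returns the unique article URLs found in $html.
// Assumes ArticleID is digits only and AidKey is an optional "-" plus digits.
function extractArticleUrls(string $html): array
{
    $pattern = '~https?://(?:www\.)?example\.com/ArticleDetails\.aspx'
             . '\?ArticleID=(?:3D)?\d+'          // optional "3D" (assumed quoted-printable residue)
             . '&(?:amp;)?AidKey=(?:3D)?-?\d+~i'; // stops at the last digit of the key

    preg_match_all($pattern, $html, $matches);

    $urls = [];
    foreach ($matches[0] as $url) {
        // Assumption: normalise "=3D" back to "=" and "&amp;" back to "&".
        $url = str_ireplace(['=3D', '&amp;'], ['=', '&'], $url);
        $urls[$url] = true; // array keys drop the href/link-text duplicate
    }
    return array_keys($urls);
}

// Usage: trailing text glued to the URL is left out of the match.
$sample = 'http://example.com/ArticleDetails.aspx?ArticleID=10975137&AidKey=701321736this is not part of the url.';
print_r(extractArticleUrls($sample));
```

Because the key is matched with \d+, anything that is not a digit ends the URL, so this only works as long as the Article Key really is purely numeric apart from a leading minus sign.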