Shouldn't a link be a well-defined regex? This is a rather theoretical question,
I second PEZ's answer:
I don't think HTML lends itself to "well defined" regular expressions since it's not a regular language.
As far as I know, any HTML tag may contain any number of nested tags. For example:
<a href="http://stackoverflow.com">stackoverflow</a>
<a href="http://stackoverflow.com"><i>stackoverflow</i></a>
<a href="http://stackoverflow.com"><b><i>stackoverflow</i></b></a>
...
Thus, in principle, to match a tag properly you must at least be able to match strings of the form:
BE
BBEE
BBBEEE
...
BBBBBBBBBBEEEEEEEEEE
...
where B means the beginning of a tag and E means the end. That is, you must be able to match strings formed by any number of B's followed by the same number of E's. To do that, your matcher must be able to "count", and regular expressions in the formal sense (i.e. those equivalent to finite state automata) simply cannot do that: a finite automaton has a fixed number of states, so it cannot remember an arbitrarily large count, and to count it needs at least a stack. Referring to PEZ's answer, the nesting in HTML is that of a context-free language, not a regular language.
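To make the "counting" point concrete, here is a minimal sketch in Python (the names `is_balanced` and `shape_only` are mine, purely for illustration). The pattern `B+E+` can only check the shape "some B's, then some E's"; the small function keeps a counter, which is the extra memory a finite automaton lacks.

```python
import re

def is_balanced(s: str) -> bool:
    """Return True if s is n B's followed by exactly n E's, with n >= 1."""
    depth = 0          # number of B's not yet closed by an E
    seen_end = False   # becomes True once the first E has been read
    for ch in s:
        if ch == "B":
            if seen_end:       # a B after an E breaks the B...BE...E shape
                return False
            depth += 1
        elif ch == "E":
            seen_end = True
            depth -= 1
            if depth < 0:      # more E's than B's so far
                return False
        else:
            return False       # only B and E are allowed
    return seen_end and depth == 0

# A plain regular expression: checks the shape, not that the counts agree.
shape_only = re.compile(r"B+E+")

for s in ["BE", "BBEE", "BBBEE", "BEE"]:
    print(s, bool(shape_only.fullmatch(s)), is_balanced(s))
```

The regex accepts all four strings, while the counter accepts only `BE` and `BBEE`. Real HTML is of course messier than this toy language (tags interleave with text, and different tag names must pair up), so this only illustrates the argument; it is not a way to parse HTML.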