HTML is not a regular language, so you should not use regular expressions to parse it. Use a DOM parser like DOMDocument
instead. However, for the sake of learning, I will show what was wrong with your expression.
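For reference, the DOM approach might look roughly like the sketch below. The sample markup and the variable names are assumptions modeled on the link in your expression, not your actual page, so adjust them to fit.

```php
<?php
// Hypothetical sample markup, modeled on the link in the question.
$html = '<html><body>'
      . '<a target="frameleft" href="Home.aspx?t=12345">Home</a>'
      . '<a target="other" href="Home.aspx?t=99999">Elsewhere</a>'
      . '</body></html>';

$doc = new DOMDocument();
// Collect libxml warnings internally instead of printing them for imperfect markup.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$tValues = [];
foreach ($doc->getElementsByTagName('a') as $link) {
    // Only consider links that target the "frameleft" frame.
    if ($link->getAttribute('target') !== 'frameleft') {
        continue;
    }
    // Pull the query string out of the href, then read its "t" parameter.
    $query = parse_url($link->getAttribute('href'), PHP_URL_QUERY);
    if (!is_string($query)) {
        continue;
    }
    parse_str($query, $params);
    if (isset($params['t'])) {
        $tValues[] = $params['t'];
    }
}

print_r($tValues);
```

This does not depend on the attribute order or spacing inside the tag, which is where a regex tends to break.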
The problem is that ? is a reserved character meaning "the preceding token is optional" and . is a reserved character meaning "any character". To match them literally, escape each with a backslash (\):
<a target="frameleft" href="Home\.aspx\?t=\d+">(.*?)<\/a>
Also, the s modifier means dot-matches-newline, so unless you expect the links to contain line breaks, it is unnecessary.
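A small sketch of what the s modifier changes, using a made-up link whose text spans two lines (the markup here is purely illustrative):

```php
<?php
// Hypothetical link whose text contains a line break.
$html = "<a href=\"x\">first\nsecond</a>";

// Without s, the dot does not match the newline, so the pattern fails.
$withoutS = preg_match('/<a href="x">(.*?)<\/a>/', $html);

// With s, the dot also matches newlines, so the pattern succeeds.
$withS = preg_match('/<a href="x">(.*?)<\/a>/s', $html);

var_dump($withoutS, $withS);
```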
I also just noticed that you wanted the "t" value. Currently your capture group is around the contents of the link, (.*?). Instead, you want to capture the value of t, (\d+). Modify the expression to:
<a target="frameleft" href="Home\.aspx\?t=(\d+)">.*?<\/a>
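As a rough usage sketch, assuming / as the delimiter (which is why the closing tag is written <\/a>) and a made-up sample link, the expression could be applied with preg_match like this:

```php
<?php
// Hypothetical sample input, modeled on the link in the question.
$html = '<a target="frameleft" href="Home.aspx?t=12345">Home</a>';

// The corrected expression, wrapped in / delimiters.
$pattern = '/<a target="frameleft" href="Home\.aspx\?t=(\d+)">.*?<\/a>/';

$t = null;
if (preg_match($pattern, $html, $matches)) {
    // $matches[0] is the whole link; $matches[1] is the first capture group, the digits after t=.
    $t = $matches[1];
}

var_dump($t);
```

If the page contains several such links, preg_match_all with the same pattern collects every t value in $matches[1].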