I'm creating a regex. This is my test dataset:
<a href="test.html">test1</a>
<a href="test.pdf">test2</a>
<a href="test.html">test1</a>
<a href="test.html">test1</a><a href="testtime.pdf">test2</a>
I'm trying to capture everything from "href=" through ".pdf", but the following regex:
href=.*?\.pdf
captures the right data only when the link is alone on its line. On the last line of the dataset it matches the following instead:
href="test.html">test1</a><a href="testtime.pdf
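For reference, here is a minimal sketch that reproduces the behaviour. It assumes Python's re module, since I haven't tied this to a particular regex engine, and the variable names are only illustrative. As far as I can tell, the lazy ".*?" only makes the match end as early as possible; the match still starts at the leftmost "href=" on the line.

```python
import re

# The four lines of the test dataset, one string per line.
lines = [
    '<a href="test.html">test1</a>',
    '<a href="test.pdf">test2</a>',
    '<a href="test.html">test1</a>',
    '<a href="test.html">test1</a><a href="testtime.pdf">test2</a>',
]

# The regex from the question, unchanged.
pattern = re.compile(r'href=.*?\.pdf')

# Collect every match, line by line.
matches = [m.group(0) for line in lines for m in pattern.finditer(line)]

for match in matches:
    print(match)
```

The second match printed is the unwanted one: it starts at the first "href=" on the last line and runs all the way to ".pdf".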
I only want the text from the last "href" to the ".pdf". I don't want the first "href" on the line, or anything that comes between it and the second "href". Is it possible to modify the regex to match this properly?
Thanks.