
I am hoping someone can help me make this match non-greedy. I am using JavaScript and Classic ASP.

.match(/(<a\s+.*?><\/a>)/ig);

The purpose is to extract URLs from a page that contains links in this format: <a href ></a>

I need to capture just the url

Thanks

AndersTornkvist
Gerald Ferreira
  • Oh no, not yet another lost soul trying to parse HTML with Regex. Read this answer as to why this is bad: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – Darin Dimitrov Jul 14 '10 at 20:25
  • He is not trying to parse HTML, simply capture a value that has well-defined boundaries (href="...") in a string of text. Regex as a general HTML parser does not work, but for specific value cases, it's often viable. – Josiah Jul 14 '10 at 20:26
  • Oh I highly doubt it is so constrained because the address that you might have in this `href` could easily break any Regex attempt. – Darin Dimitrov Jul 14 '10 at 20:26
  • Like for example: `Alert`. – Nebril Jul 14 '10 at 20:32
  • Yep, I have perfectly formed hrefs that I just needed to strip out – Gerald Ferreira Jul 14 '10 at 20:36

1 Answer

Try the following:

.match(/(<a\s+.*?href="(.*?)".*?>)/ig);
Josiah
  • match(/href="(.*?)"/ig, ''); <<< I have derived this from your idea... Thanks a million it now captures href="domainname.com" just what I needed! – Gerald Ferreira Jul 14 '10 at 20:35
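
A note on capturing just the URL: with the `g` flag, `match()` returns the whole matches (for example `href="domainname.com"`), not the capture group. Here is a minimal sketch of one way to get only the URLs, using an `exec()` loop. The helper name `extractHrefs` and the sample string are assumptions made up for illustration, and the pattern only handles well-formed, double-quoted `href` attributes like the ones described in the question.

```javascript
// Hypothetical helper (not from the original thread): collect the value of
// every double-quoted href attribute found in a string.
function extractHrefs(html) {
  var pattern = /href="(.*?)"/ig; // lazy group stops at the first closing quote
  var urls = [];
  var m;
  // With the g flag, exec() resumes from pattern.lastIndex on each call
  // and returns null once there are no more matches.
  while ((m = pattern.exec(html)) !== null) {
    urls.push(m[1]); // m[0] is the whole match, m[1] is the captured URL
  }
  return urls;
}

// Made-up sample input for illustration only.
var sample = '<a href="http://example.com/a"></a> <a class="x" href="/b"></a>';
var urls = extractHrefs(sample);
```

The loop uses only `exec()` and `var`, so it should also work in older script engines; in a modern browser, `matchAll()` would be an alternative.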