-1

I want a regular expression to fetch any URL between double quotes.

<a href="http://www.any-web_address.com">
<a href="http://142.12.64.71:8083">
Alan Moore
user561810

5 Answers

0
"http://[0-9a-zA-Z.:_-]{1,100}"
F.P
user561810
  • Although this code may help to solve the problem, providing additional context regarding _why_ and/or _how_ it answers the question would significantly improve its long-term value. Please [edit] your answer to add some explanation. In particular, I don't see how this matches URLs such as `ftp://server/file` or `mailto:user@host`... – Toby Speight Jul 04 '16 at 16:46
0

Something like this?

\"\K([\w\:\/\.\-]+)

If you want the double quotes included in the match (the question says "fetch any URL between double quotes", so I assumed without the \"):

\"([\w\:\/\.\-]+)\"
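
As a quick check, here is a minimal Python sketch that runs the capturing-group form against the two sample tags from the question. Python's built-in `re` module does not support `\K`, so only the second pattern is shown; the names `QUOTED_URL` and `sample` are made up for this illustration.

```python
import re

# Capturing-group variant: the quotes are part of the match,
# but findall() returns only the text captured by the group.
QUOTED_URL = re.compile(r'"([\w:/.\-]+)"')

sample = '''<a href="http://www.any-web_address.com">
<a href="http://142.12.64.71:8083">'''

urls = QUOTED_URL.findall(sample)
print(urls)
```

Note that this accepts any quoted run of those characters, not only URLs, so an attribute such as `class="nav"` would be returned as well.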
Michał M
  • Side question: Why the `\K` if you match with a group anyway? Just curious. – F.P Jul 04 '16 at 09:07
  • The question says "fetch any URL BETWEEN double quotes", so in my understanding that means without them. So: match the first \", reset the start of the match with \K, and then the rest of the regex matches the URL. – Michał M Jul 04 '16 at 09:13
0

Here is my suggestion (in case your regex flavour supports lookarounds):

(?<=href="|link="|src=")(((http|https)(:\/\/))?([\/\w\-]{2,})(([\.])([\w\-]*)){1,})([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-]*)(?=")
Maria Ivanova
0

Unless you narrow the scope of your problem, this post may help you: "Why it's not possible to use regex to parse HTML/XML: a formal explanation in layman's terms". Otherwise, if for instance you only want the URIs after href=, you can do it like this:

/(?:href=")([^"]+)"/g
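
For readers not working in a flavour with `/.../g` literals, a rough Python equivalent might look like the sketch below. It assumes the attribute value is always wrapped in double quotes and uses `[^"]+` for the value; `HREF`, `extract_hrefs` and the sample markup are illustrative names and data, not anything from the question.

```python
import re

# Capture everything between href=" and the next double quote.
HREF = re.compile(r'href="([^"]+)"')

def extract_hrefs(html):
    """Return every double-quoted href value found in the given markup."""
    # findall() scans the whole string, playing the role of the /g flag.
    return HREF.findall(html)

html = ('<a href="http://www.any-web_address.com">x</a> '
        '<a class="nav" href="ftp://server/file">y</a>')
hrefs = extract_hrefs(html)
print(hrefs)
```

Restricting the match to `href=` means other quoted attributes such as `class="nav"` are skipped.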
Community
M. Timtow
0

Use the following regex:

.*href="(\S*)"

We are effectively looking for the presence of href= and then capturing all non-whitespace characters that appear between double quotes.
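
A small Python sketch of this pattern, applied line by line to the two tags from the question, could look like this. The names `HREF_LINE`, `lines` and `captured` are only for illustration.

```python
import re

# .* skips ahead to href=", then \S* grabs non-whitespace characters
# and backtracks to the last double quote it can end on.
HREF_LINE = re.compile(r'.*href="(\S*)"')

lines = [
    '<a href="http://www.any-web_address.com">',
    '<a href="http://142.12.64.71:8083">',
]

# match() anchors at the start of each line; keep group 1 where it matched.
captured = [m.group(1) for m in map(HREF_LINE.match, lines) if m]
print(captured)
```

One caveat: because `\S*` is greedy, a second attribute that follows with no whitespace in between (as in `<a href="a"class="b">`) ends up inside the capture. Using `[^"]*` in place of `\S*` would avoid that.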

CinCout