I need a regex for use in JavaScript that matches URLs according to these rules:

  • Should start with http, https, ftp, ftps or mailto: (required)
  • www is optional
  • a-z, A-Z, 0-9 and some special characters .:-/ are allowed

Since I'm not that familiar with regex, I tried the one from the second answer (by @foufos) to the question "What is a good regular expression to match a URL?":

/^(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9/]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9/]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9/]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})

This regex matches all the links I need except two: http://intranet/index.html and mailto:sample@sample.com

So I tried to modify it and added the mailto: rule:

/^((http(s)?)|(ftp(s)?):\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,}|(mailto:){1}([\w\.]+)\@{1}[\w]+\.[\w]{2,})\s$/gm;

At the moment, two things are not working:

  • This URL matches: www.google.com, but it should not, since a URL has to start with http, https, ftp, ftps or mailto: (required). I only put the ? after the (s), so only the s should be optional. Why does this not work?

  • This URL still does not match: http://intranet/index.html, although I thought I had the right rule for the special characters .:-/.

TESTING: List of URLs which should match:

  • https://www.sample.com
  • https://www.sample-sample.com
  • http://www.sample.com
  • http://www.sample-sample.com
  • https://sample-sample.com
  • http://sample-sample.com
  • ftps://sample-sample.com
  • ftp://sample-sample.com
  • ftps://sample.sample.com:3000
  • ftp://sample.sample.com:3000
  • http://intranet/index.html
  • mailto:sample@sample.com

List of URLs which should not match:

  • www.google.com

Any inputs?

webta.st.ic
  • Why did I get downvoted on this one? I explained what I tried: to force the start of the string, I only put the `?` after the `s` in `https` and `ftps`, so just the `s` should be optional while one of the schemes (http, ftp) is required. It does not work, since `www.google.com` still matches. I can't figure out why I got this downvote... – webta.st.ic Jun 25 '19 at 12:11
  • Could you provide a list of a few URLs that need to match, so we can copy and paste it for testing purposes? Also, do you need to check the VALIDITY of the email, as in whether it's a valid email string? – Smytt Jun 25 '19 at 12:17

2 Answers

The simplest Regexp:

^(((https?|ftps?):\/\/)|(mailto:))[a-zA-Z0-9\.:\/@-]*$

You can play with it here: https://regex101.com/r/F1SqQo/1

It only checks the scheme and the allowed characters. It does not check that the URL is well-formed, nor that the email address is valid.
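
As a quick sanity check, the pattern above can be run against the test lists from the question. This is only a minimal sketch: the names `urlPattern`, `shouldMatch`, `shouldNotMatch` and the helper `failures` are made up for illustration, and the list uses `ftps://sample-sample.com` without a trailing comma, on the assumption that the comma in the question is a typo.

```javascript
// Pattern from this answer, written as a JavaScript regex literal.
const urlPattern = /^(((https?|ftps?):\/\/)|(mailto:))[a-zA-Z0-9\.:\/@-]*$/;

// Test lists taken from the question.
const shouldMatch = [
  "https://www.sample.com",
  "https://www.sample-sample.com",
  "http://www.sample.com",
  "http://www.sample-sample.com",
  "https://sample-sample.com",
  "http://sample-sample.com",
  "ftps://sample-sample.com",
  "ftp://sample-sample.com",
  "ftps://sample.sample.com:3000",
  "ftp://sample.sample.com:3000",
  "http://intranet/index.html",
  "mailto:sample@sample.com",
];
const shouldNotMatch = ["www.google.com"];

// Hypothetical helper: returns the inputs whose test result differs from `expected`.
function failures(pattern, inputs, expected) {
  return inputs.filter((input) => pattern.test(input) !== expected);
}

const missed = failures(urlPattern, shouldMatch, true);
const wronglyAccepted = failures(urlPattern, shouldNotMatch, false);

console.log("missed:", missed);                    // missed: []
console.log("wrongly accepted:", wronglyAccepted); // wrongly accepted: []
```

Because the scheme alternatives are wrapped in one group that is not optional, `www.google.com` is rejected. Note that the trailing `*` also lets a bare `http://` or `mailto:` through, since the pattern does not require any characters after the scheme.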

strah
  • sorry for downvoting but this does not validate against what the user asked for - for one, you have combined http and https. The match will also be cluttered with hundreds of matched groups, which the user did not ask for. – Smytt Jun 25 '19 at 12:32
  • What do you mean "combined http and https"? Also not sure about hundreds of matching groups... – strah Jun 25 '19 at 12:39
  • The way you use the `|` symbol: it applies to the two groups surrounding it, so it means `s` or `f`. You should have wrapped https / ftps in groups and then made those groups non-capturing with `?:`. Just paste your regex into regex101 and see that it doesn't match the provided examples. – Smytt Jun 25 '19 at 13:06
  • Yeah, it had some minor errors (unescaped slashes). It matches all the sample URLs. – strah Jun 25 '19 at 13:24

This regex matches your strings but DOES NOT validate them. This means it still accepts forbidden characters, such as commas. The email doesn't have to contain an @: after the scheme, the pattern simply consumes non-whitespace characters up to the next whitespace or the end of the line:

(?:(?:(?:(?:http(?:s)?)|(?:ftp(?:s)?)):\/\/)|(?:mailto:))[^\s]*
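
To make the caveats concrete, here is a small sketch of how the pattern behaves in JavaScript. The names `loosePattern` and `firstMatch` and the sample inputs are assumptions chosen for illustration, not part of the question.

```javascript
// Pattern from this answer. It has no ^ or $ anchors,
// so it can also match in the middle of a longer string.
const loosePattern = /(?:(?:(?:(?:http(?:s)?)|(?:ftp(?:s)?)):\/\/)|(?:mailto:))[^\s]*/;

// Hypothetical helper: returns the matched substring, or null when nothing matches.
function firstMatch(input) {
  const result = input.match(loosePattern);
  return result ? result[0] : null;
}

console.log(firstMatch("ftps://sample-sample.com,"));          // "ftps://sample-sample.com," (comma kept)
console.log(firstMatch("mailto:no-at-sign"));                  // "mailto:no-at-sign" (no @ needed)
console.log(firstMatch("see http://intranet/index.html now")); // "http://intranet/index.html"
console.log(firstMatch("www.google.com"));                     // null (no scheme)
```

If whole-string validation is needed, one option would be to add `^` in front and `$` at the end; the outer non-capturing group already encloses the alternation, so the anchors would apply to every alternative.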
Smytt