
I have the following regex code that takes a long time (30+ seconds) to validate large URLs:

let validReg = new RegExp('^(https?:\\/\\/)?'+ // protocol
    '((([a-z\\d]([a-z\\d-]*[a-z\\d])*)\\.)+[a-z]{2,}|'+ // domain name
    '((\\d{1,3}\\.){3}\\d{1,3}))'+ // OR ip (v4) address
    '(\\:\\d+)?(\\/[-a-z\\d%_.~+]*)*'+ // port and path
    '(\\?[;&a-z\\d%_.~+=-]*)?'+ // query string
    '(\\#[-a-z\\d_]*)?$','i');
let isValid = validReg.test(component.repository_url);

How can I change this code so that it performs the same validation more efficiently?

  • Why is this tagged as Python? This looks like JavaScript. – BrokenBenchmark Mar 06 '22 at 03:36
  • If it was Python, you'd skip the regex and use a dedicated URL parsing API, e.g. [urllib.parse.urlparse](https://docs.python.org/3/library/urllib.parse.html#urllib.parse.urlparse). Parsing complicated formats with a regex is typically a terrible idea (even if it works, it's rarely maintainable). – ShadowRanger Mar 06 '22 at 03:46
  • @BrokenBenchmark Thanks for pointing out the mistake. This is JavaScript. – web-dev-nerd Mar 06 '22 at 07:04
  • Since it is JavaScript, one would use parser-based approaches as well: the [Web API](https://developer.mozilla.org/en-US/docs/Web/API) provides [`URL`](https://developer.mozilla.org/en-US/docs/Web/API/URL) and [`URLSearchParams`](https://developer.mozilla.org/en-US/docs/Web/API/URLSearchParams). – Peter Seliger Mar 06 '22 at 11:33
  • Does this answer your question? [What is the best regular expression to check if a string is a valid URL?](https://stackoverflow.com/questions/161738/what-is-the-best-regular-expression-to-check-if-a-string-is-a-valid-url) – Peter Seliger Mar 06 '22 at 13:42
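
The parser-based approach suggested in the comments could look roughly like the sketch below. `isHttpUrl` is a hypothetical helper name, not something from the thread. Note one behavioral difference: the `URL` constructor requires a scheme, whereas the regex in the question treats the protocol as optional, so this is not a drop-in replacement.

```javascript
// Hypothetical helper (not from the original thread): lets the built-in
// URL parser do the work instead of a hand-written regex.
function isHttpUrl(candidate) {
  let parsed;
  try {
    // The URL constructor throws a TypeError when the input cannot be parsed.
    parsed = new URL(candidate);
  } catch (err) {
    return false;
  }
  // Accept only http and https, mirroring the `https?` part of the regex.
  return parsed.protocol === 'http:' || parsed.protocol === 'https:';
}

console.log(isHttpUrl('https://example.com/org/repo?tab=readme#top'));
console.log(isHttpUrl('example.com/org/repo')); // no scheme, so parsing fails
```

Because the parser does not backtrack the way a regex engine does, its running time should not blow up on long inputs.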

1 Answer

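The slowdown comes from the part that matches hyphen-separated word chars. In [a-z\\d]([a-z\\d-]*[a-z\\d])*, a quantified group contains another quantifier over an overlapping character class, so the engine can split the same run of characters in exponentially many ways before it gives up on a non-matching input (catastrophic backtracking). Refactor it so that every repetition has to start with a hyphen: [a-z\\d]([a-z\\d-]*[a-z\\d])* => [a-z\\d]+(?:-+[a-z\\d]+)*.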

Also, note that you do not need to escape / chars: they are not special regex metacharacters, and you are not using a regex literal, where / would be the delimiter.

You may use

let validReg = new RegExp('^(https?://)?'+ // protocol
    '((([a-z\\d]+(?:-+[a-z\\d]+)*)\\.)+[a-z]{2,}|'+ // domain name
    '((\\d{1,3}\\.){3}\\d{1,3}))'+ // OR ip (v4) address
    '(:\\d+)?(/[-a-z\\d%_.~+]*)*'+ // port and path
    '(\\?[;&a-z\\d%_.~+=-]*)?'+ // query string
    '(#[-a-z\\d_]*)?$','i');
let isValid = validReg.test(component.repository_url);
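
To see the effect in isolation, here is a minimal sketch that times only the domain-label subpattern, before and after the change, on an input that cannot match. The names `slowLabel`, `fastLabel`, `timeMs` and the test input are made up for illustration, and the timings depend on the engine and machine. Both patterns should accept the same labels; only the amount of backtracking differs.

```javascript
// Label subpattern from the question: a quantified group that itself
// contains a quantifier over an overlapping character class.
const slowLabel = /^[a-z\d]([a-z\d-]*[a-z\d])*$/i;

// Label subpattern from the answer: every repetition must begin with
// one or more hyphens, so there is only one way to split a run of letters.
const fastLabel = /^[a-z\d]+(?:-+[a-z\d]+)*$/i;

// Hypothetical helper: runs one test and reports the result and elapsed time.
// Date.now() has coarse (millisecond) resolution, which is enough here.
function timeMs(regex, input) {
  const start = Date.now();
  const matched = regex.test(input);
  return { matched, elapsedMs: Date.now() - start };
}

// A run of valid characters followed by one invalid character forces the
// engine to try every possible split before it can report failure.
// Each extra 'a' roughly doubles the work for slowLabel, so keep this small.
const input = 'a'.repeat(22) + '_';

const slow = timeMs(slowLabel, input);
const fast = timeMs(fastLabel, input);
console.log('question pattern:', slow);
console.log('answer pattern:  ', fast);
```

Increasing the repeat count by a few characters at a time makes the gap obvious quickly, which is consistent with the 30+ seconds reported in the question.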
Wiktor Stribiżew