
I created a function that should check whether a string corresponds to a URL format:

const test = (str) => {
  const t = new RegExp(
    '^(https?:\\/\\/)?' + 
      '(www\\.)' + 
      '((([a-z\\d]([a-z\\d-]*[a-z\\d])*)\\.)+[a-z]{2,}|)' + 
      '(\\#[-a-z\\d_]*)?$', 
    'i',
  );

  return t.test(str);
};

console.log(test('http://demo.com')); //expect true
console.log(test('http://ww.demo.com')); //expect false

For each console.log() I wrote the expected value. In both cases I got false. For the last case false is correct, but for the first one I should get true. How can I fix the regex?

Tom
Asking
  • Did you forget to add `?` after `(www\\.)`? That would mean it expects `www.` 0 or 1 times. The way you have it currently it is expecting `www.` to show up in the URL in order to match, and therefore both your urls fail. – Hanlet Escaño Aug 31 '21 at 18:11
  • @HanletEscaño, if I change it to `'(www\\.)?'` I get true for both, but the second one should be false. Do you know how to solve this? – Asking Aug 31 '21 at 18:12
  • Why should the second one be false? `ww.` is in fact a valid subdomain, so the regex is valid. Do you want to match only URLs that have the www. subdomain and nothing else (ie. no other subdomains)? – Hanlet Escaño Aug 31 '21 at 18:13
  • @HanletEscaño, I did not find any site with this. How can I know whether this is valid? – Asking Aug 31 '21 at 18:15
  • If you are writing a regex to match valid URLs, this answer may help: https://stackoverflow.com/a/3809435/752527, notice that with those regexes both your urls would still be valid. – Hanlet Escaño Aug 31 '21 at 18:17
  • This question does not contain enough information about the requirements. Since `http://ww.demo.com` _is_ syntactically valid according to the common definition of URLs, OP is clearly operating with a special definition of "URL" that is not provided. Giving us a (broken) regex is not an adequate substitute for explicitly stating what your special requirements are. At a minimum, you must explain why you want the second string to fail validation. – Tom Feb 17 '22 at 04:58
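The behaviour discussed in the comments above can be reproduced with a small sketch. It reuses the pattern from the question and only makes the `www.` group optional; the name `looksLikeUrl` is a placeholder chosen here, not something from the original post. Under that one change, both sample strings match, because `ww.` is treated as an ordinary subdomain label.

```javascript
// Same pattern as in the question, except that the "www." group is optional.
// `looksLikeUrl` is a hypothetical name used only for this illustration.
const looksLikeUrl = (str) => {
  const pattern = new RegExp(
    '^(https?:\\/\\/)?' + // optional protocol
      '(www\\.)?' + // optional "www." (mandatory in the question's version)
      '((([a-z\\d]([a-z\\d-]*[a-z\\d])*)\\.)+[a-z]{2,}|)' + // labels and TLD
      '(\\#[-a-z\\d_]*)?$', // optional fragment
    'i',
  );
  return pattern.test(str);
};

console.log(looksLikeUrl('http://demo.com')); // true
console.log(looksLikeUrl('http://ww.demo.com')); // true
```

Rejecting the second string would therefore need an extra rule about which subdomains are allowed, which the regex as written does not express.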

1 Answer


This answer may be more than the problem needs, but it illustrates the point: even if it is possible to write a regexp that checks the URL, it is much simpler and more robust to parse the URL into a real object, on which the overall test can be decomposed into a number of smaller tests.

So the built-in URL constructor of modern browsers may help you here (link1, link 2).

One approach to testing your URL might look like this:

function testURL(urlstring) {
  var errors = [];
  try {
    var url = new URL(urlstring);

    // url.protocol includes the trailing colon, e.g. "https:"
    if (!/https/.test(url.protocol)) {
      errors.push('wrong protocol');
    }

    // more tests here
  } catch (err) {
    // new URL() throws a TypeError when the string cannot be parsed;
    // record it so that an unparsable string does not pass as valid
    errors.push('invalid URL');
  }
  return errors;
}

if (testURL('mr.bean').length == 0) { runSomething(); }
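
As one possible "smaller test", here is a minimal sketch of a hostname check. It assumes, and this is only a guess at the requirement raised in the comments, that the host should be either a bare two-label domain such as `demo.com` or that domain prefixed with `www.`. The helper name `hasAllowedHost` is hypothetical.

```javascript
// Hypothetical helper: accepts only "example.tld" or "www.example.tld" hosts.
// The rule itself is an assumption; the question never states it explicitly.
function hasAllowedHost(urlstring) {
  let url;
  try {
    url = new URL(urlstring);
  } catch (err) {
    return false; // new URL() throws a TypeError on unparsable input
  }
  const labels = url.hostname.split('.');
  if (labels.length === 2) {
    return true; // bare domain, e.g. "demo.com"
  }
  return labels.length === 3 && labels[0] === 'www';
}

console.log(hasAllowedHost('http://demo.com')); // true
console.log(hasAllowedHost('http://ww.demo.com')); // false
```

Counting labels is a simplification: it would reject hosts under multi-part suffixes such as `www.example.co.uk`, so a real check would need a different rule for those.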
Lakshmanan k