36

I want to validate a URL of the types:

  • www.google.com
  • http://www.google.com
  • google.com

using a single regular expression. Is that achievable? If so, kindly share a solution in JavaScript.

Please note that I only expect the underlying protocol to be HTTP or HTTPS. The main question is how to match all three of these patterns with one single regular expression in JavaScript. It doesn't have to check whether the page is active or not. If the value entered by the user matches any of the three cases listed above, it should return true; otherwise it should return false.

Max
Arslan Sohail
  • http://stackoverflow.com/a/15607801/2025923 will help you – Tushar Jun 19 '15 at 06:12
  • Why not use `url` module. https://www.npmjs.com/package/url – Swaraj Giri Jun 19 '15 at 06:15
  • When you say validate, do you mean ONLY that it is a legal URL or do you mean that the URL actually works and reaches a live site? Does it have to be an absolute URL or are relative URLs allowed? And, what protocols do you want to accept? Only http? http and https? What about other protocols? You've left a lot of questions unanswered. When you say you want to accept "google.com", what does that mean? It doesn't have a protocol. – jfriend00 Jun 19 '15 at 06:16
  • By validate I mean that the URL actually works and reaches a live site. I want to work with the http and https protocols only. Let me know if you still need any information. – Arslan Sohail Jun 19 '15 at 06:18
  • possible duplicate of [What is the best regular expression to check if a string is a valid URL?](http://stackoverflow.com/questions/161738/what-is-the-best-regular-expression-to-check-if-a-string-is-a-valid-url) – user2226755 Jun 19 '15 at 06:39
  • Please use the "edit" link to edit your question to clarify that you want to make sure the URL reaches a live site. None of the answers you have received so far do that because your question does not specify that. Also, if you want browser-like behavior where the browser will guess that you meant `http://` if you do not have a protocol on the URL, then also add that to your question. Your question is incomplete. You should edit it to add ALL the details about what you actually want. Good questions generally get good and complete answers. Incomplete questions don't. – jfriend00 Jun 19 '15 at 06:49
  • @SwarajGiri Can you validate a URL with that module? It only parses a string looking for URLs; it doesn't check whether the result is valid. – JulianSoto Jan 09 '18 at 04:47

6 Answers

62

There is no need to use a third party library.

To check if a string is a valid URL

  const URL = require("url").URL;

  const stringIsAValidUrl = (s) => {
    try {
      new URL(s);
      return true;
    } catch (err) {
      return false;
    }
  };

  stringIsAValidUrl("https://www.example.com:777/a/b?c=d&e=f#g"); //true
  stringIsAValidUrl("invalid"); //false

Edit

If you need to restrict the protocol to a range of protocols you can do something like this

const { URL, parse } = require('url');

const stringIsAValidUrl = (s, protocols) => {
    try {
        new URL(s);
        const parsed = parse(s);
        return protocols
            ? parsed.protocol
                ? protocols.map(x => `${x.toLowerCase()}:`).includes(parsed.protocol)
                : false
            : true;
    } catch (err) {
        return false;
    }
};

stringIsAValidUrl('abc://www.example.com:777/a/b?c=d&e=f#g', ['http', 'https']); // false
stringIsAValidUrl('abc://www.example.com:777/a/b?c=d&e=f#g'); // true

Edit

Because parse is deprecated, the code can be simplified a little further. As for the issue that a protocol-only string returns true: this utility function is a template that you can easily adapt to your use case. That particular issue can be covered by adding a simple test of url.host !== ""

const { URL } = require('url');

const stringIsAValidUrl = (s, protocols) => {
    try {
        const url = new URL(s);
        return protocols
            ? url.protocol
                ? protocols.map(x => `${x.toLowerCase()}:`).includes(url.protocol)
                : false
            : true;
    } catch (err) {
        return false;
    }
};
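
The host check mentioned above is not part of the function as posted. A minimal sketch that folds it in, and that also accepts the bare forms from the question (www.google.com, google.com) by assuming http:// when no scheme is written, might look like this. The name isHttpUrl, the default-to-http behaviour, and the "hostname must contain a dot" rule are assumptions for illustration, not part of the original answer.

```javascript
const { URL } = require('url');

// Hypothetical helper: accepts "google.com", "www.google.com" and
// "http(s)://www.google.com"; rejects other "scheme://" prefixes and
// hostnames without a dot (which also excludes "localhost").
const isHttpUrl = (s) => {
  if (typeof s !== 'string' || s.trim() === '') return false;
  // Assumption: input without "scheme://" is treated as http, as a browser would.
  const candidate = /^[a-z][a-z0-9+.-]*:\/\//i.test(s) ? s : `http://${s}`;
  try {
    const url = new URL(candidate);
    return (url.protocol === 'http:' || url.protocol === 'https:')
      && url.hostname.includes('.');
  } catch (err) {
    // new URL() throws a TypeError when the string cannot be parsed.
    return false;
  }
};

console.log(isHttpUrl('www.google.com'));
console.log(isHttpUrl('abc://www.example.com:777/a/b'));
```

This is a syntax check only; it says nothing about whether the site is reachable.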
pouya
  • 2
    Does not work with invalid protocol `stringIsAValidUrl("abc://www.example.com:777/a/b"); //true` – Stephane Janicaud Jun 06 '19 at 07:58
  • 5
    @StephaneJanicaud It is still valid. `abc` could be a custom protocol registered on the OS to be handled by a specific app. The question is about validating a URI, which this function does. If you want to restrict the protocol you can do it by further parsing the URI. – pouya Jun 06 '19 at 08:15
  • 1
    The question is: "Please note i only expect the underlying protocols to be HTTP or HTTPS". Your answer does not match this requirement. – Stephane Janicaud Jun 06 '19 at 08:30
  • 3
    @StephaneJanicaud not a big deal. See my updated answer. As i said before there is no need to use third party library. Node.js `url` module can parse and validate any url. – pouya Jun 06 '19 at 09:21
  • You're absolutely right, I'm also using this module to validate urls. The problem was the protocol, not the validation method. Thanx for the update. – Stephane Janicaud Jun 06 '19 at 09:31
  • No need to parse URL for protocol, just check by match `url.startsWith('http://') || url.startsWith('https://')` – jay padaliya Oct 28 '20 at 10:47
  • Your solution returns `true` for urls contains protocol only. `stringIsAValidUrl('protocol:') // true` Theoretically this is valid url but useless. And `parse` was deprecated. https://nodejs.org/api/url.html#url_url_parse_urlstring_parsequerystring_slashesdenotehost – Oleg Nov 18 '20 at 20:08
  • Still works (at least the third edited example). – openwonk Aug 24 '21 at 21:44
  • `url` just **parses** the given URL, but doesn't **validate** it. In other words, it will try its best to parse invalid URLs, and will not complain if said URLs are wrong in just the right way. For example, `http:example.org` is [invalid by WHATWG standards](https://url.spec.whatwg.org/#example-url-parsing), but `URL` can parse it just fine. But this seems to be the expected behavior for parsers (In WHATWG's words: "A validation error does not mean that the parser terminates."). – hhs Oct 30 '21 at 11:20
  • @hhs As long as a string comes out of a URL parser as a valid URL, you can always fight back against these minor glitches, you have all the tiny pieces of a URL, So you can put them back together as you need, So if slashes are essential to your special use case, write a line of code or even another function to make sure they are where they should be. As I said my solution is a pattern that can be easily extended to cover everyone's use case. – pouya Oct 31 '21 at 05:32
  • @pouya the problem is, it might not parse it "correctly", per se. If I give the parser `example.org:443`, I might expect the parser to either complain of a missing protocol, or parse it so that `host` is `example.org` and `port` is `443`. But `URL` will say protocol is `example.org:` and `pathname` is `443`. – hhs Oct 31 '21 at 08:42
  • @hhs `stringIsAValidUrl` is the first function in the chain of functions/methods to validate a string against a special use case. `stringIsAValidUrl` just says this string technically could be a URL. You as a developer have to go further and expand on this result to see if this string fits your requirements. For instance, another function like `stringIsAUsefulUrl` has to expand on the `stringIsAValidUrl` result to see if the string complies with your criteria or not. Anyway, passing an array of ACCEPTABLE protocols to `stringIsAValidUrl` rejects cases like `not:aurl`. – pouya Nov 01 '21 at 06:57
51

There's a package called valid-url

var validUrl = require('valid-url');

var url = "http://bla.com"
if (validUrl.isUri(url)){
    console.log('Looks like an URI');
} 
else {
    console.log('Not a URI');
}

Installation:

npm install valid-url --save

If you want a simple REGEX - check this out

Jossef Harush Kadouri
  • 1
    OP apparently wants to know if the URL reaches a live site. – jfriend00 Jun 19 '15 at 06:50
  • 4
    Wouldn't recommend using a 3rd party library for this very simple problem as there is already a native solution. This just contributes to the bloated node_modules problem. – Epic Speedy Jun 04 '21 at 23:14
4

Using the url module seems to do the trick.

Node.js v15.8.0 Documentation - url module

try {
  const myURL = new URL(imageUrl);
} catch (error) {
  console.log(`${Date().toString()}: ${error.input} is not a valid url`);
  return res.status(400).send(`${error.input} is not a valid url`);
}
stackers
3

The "valid-url" npm package did not work for me: it reported an invalid URL as valid. What worked for me was "url-exists", which checks whether the URL can actually be reached.

const urlExists = require("url-exists");

urlExists(myurl, function(err, exists) {
  if (exists) {
    res.send('Good URL');
  } else {
    res.send('Bad URL');
  }
});
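
On Node.js 18 or later, where fetch is available globally, a similar reachability check can be sketched without a dependency. The name urlIsReachable, the use of a HEAD request, and treating any status below 400 as "exists" are assumptions for illustration; this is not how url-exists itself is implemented.

```javascript
// Hypothetical helper: resolves to true if the server answers a HEAD request
// with a status below 400. `fetchImpl` is injectable so the function can be
// exercised without network access; by default it uses the global fetch.
const urlIsReachable = async (url, fetchImpl = fetch) => {
  try {
    const response = await fetchImpl(url, { method: 'HEAD' });
    return response.status < 400;
  } catch (err) {
    // Unparseable URL, DNS failure, refused connection, and so on.
    return false;
  }
};

// Usage (performs a real request):
//   urlIsReachable('https://stackoverflow.com')
//     .then((exists) => console.log(exists ? 'Good URL' : 'Bad URL'));
```

Some servers reject HEAD requests, so a real check may need to fall back to GET.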
MSi
  • 97
  • 3
  • 5
    what does good and bad url refer to? and what is res? please be more specific for the ones who doesn't use node a lot – Chris Jul 23 '18 at 21:45
  • It may be a little late, but for other people who visit first time, I suppose `good URL` means that the url is valid. Now, res stands for response and it is used in servers (like expressjs) to send a response to the client that made the request. Hope it helps. – JexSrs May 20 '22 at 13:55
2

Another easy way is to use the Node.js DNS module.

The DNS module performs name resolution, so with it you can verify whether the hostname in the URL resolves to an address.

const dns = require('dns');
const url = require('url');

const lookupUrl = "https://stackoverflow.com";
const parsedLookupUrl = url.parse(lookupUrl);

// With a protocol, the host name is in `hostname` (`host` would also include
// the port). Without one (e.g. "google.com"), url.parse() puts it in `path`.
const hostname = parsedLookupUrl.protocol
  ? parsedLookupUrl.hostname
  : parsedLookupUrl.path;

dns.lookup(hostname, (error, address, family) => {
  console.log(error || !address
    ? lookupUrl + ' is an invalid url!'
    : lookupUrl + ' is a valid url at ' + address);
});

That way you can check that the URL parses and that its host exists in DNS. It does not confirm that a web server answers at that address.
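
The same idea can be sketched with the WHATWG URL class and the promise-based dns.promises.lookup, which avoids the legacy url.parse. The name hostResolves and the choice to assume http:// for input without a scheme are illustrative assumptions.

```javascript
const dns = require('dns');
const { URL } = require('url');

// Hypothetical helper: resolves to true if the URL's hostname has a DNS record.
// `lookup` is injectable so the function can be exercised without network access.
const hostResolves = async (input, lookup = dns.promises.lookup) => {
  try {
    // Assumption: input without "scheme://" is treated as http.
    const candidate = /^[a-z][a-z0-9+.-]*:\/\//i.test(input) ? input : `http://${input}`;
    const { hostname } = new URL(candidate); // port and path are dropped here
    const { address } = await lookup(hostname);
    return Boolean(address);
  } catch (err) {
    // Either the string did not parse or the lookup failed.
    return false;
  }
};

// Usage (performs a real DNS query):
//   hostResolves('https://stackoverflow.com').then(console.log);
```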

0

I am currently having the same problem, and Pouya's answer will do the job just fine. The only reason I won't be using it is because I am already using the NPM package validate.js and it can handle URLs.

As you can see from the documentation, the URL validator uses a regular expression based on this gist, so you can use it without using the whole package.

I am not a big fan of Regular Expressions, but if you are looking for one, it is better to go with a RegEx used in popular packages.
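
For the three forms in the question, a deliberately narrow pattern can be sketched as follows. This is an illustrative regex written for this thread, not the one from the gist or from validate.js; it only accepts ASCII hostnames and ignores internationalized domains, IP addresses and localhost.

```javascript
// Illustrative pattern (an assumption, not taken from validate.js or the gist):
//   optional http:// or https://
//   one or more dot-terminated labels, then a TLD of two or more letters
//   optional port, optional path/query/fragment without whitespace
const HTTP_URL_RE =
  /^(?:https?:\/\/)?(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}(?::\d{1,5})?(?:[/?#]\S*)?$/i;

const looksLikeHttpUrl = (s) => HTTP_URL_RE.test(s);

console.log(looksLikeHttpUrl('www.google.com'));        // true
console.log(looksLikeHttpUrl('http://www.google.com')); // true
console.log(looksLikeHttpUrl('google.com'));            // true
console.log(looksLikeHttpUrl('ftp://google.com'));      // false
console.log(looksLikeHttpUrl('google'));                // false
```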

Anas Tiour