76

How do I test to see if links are external or internal? Please note:

  1. I cannot hard code the local domain.
  2. I cannot test for "http". I could just as easily be linking to my own site with an http absolute link.
  3. I want to use jQuery / javascript, not css.

I suspect the answer lies somewhere in location.href, but the solution evades me.

Thanks!

Bhargav Rao
Matrym

17 Answers

74

I know this post is old but it still shows at the top of results so I wanted to offer another approach. I see all the regex checks on an anchor element, but why not just use window.location.host and check against the element's host property?

function link_is_external(link_element) {
    return (link_element.host !== window.location.host);
}

With jQuery:

$('a').each(function() {
    if (link_is_external(this)) {
        // External
    }
});

And with plain JavaScript:

var links = document.getElementsByTagName('a');
for (var i = 0; i < links.length; i++) {
    if (link_is_external(links[i])) {
        // External
    }
}
Korvin Szanto
Daved
  • 8
    This is the only answer which makes sense to me -- the others are way overengineered. Are there any arguments against this method? – tremby Oct 08 '14 at 01:01
  • The jQuery above works for me in Safari, and handles all of the issues that the others handle -- local anchors, relative URLs, cross-protocol URLs, etc. Note: http://www.example.com and https://www.example.com will be marked as internal; that may be important to you. – Mark Jun 17 '15 at 00:05
  • 1
    This is the best answer. This makes use of the `a.host` property which is probably unknown to the average JavaScript developer (including myself before reading this). – dana Nov 10 '15 at 17:47
  • Perfect, just what I was looking for! Clean and simple without all the regex hassle which isn't needed. – Nicholas Aug 23 '16 at 12:56
  • Works flawlessly. Best answer. – Chololoco Apr 19 '17 at 22:04
  • 1
    this will fail when port number is specified, also if "www" is left out – Andrew May 05 '17 at 00:17
  • 1
    I'm a little late to the follow up here, but I would argue to @Andrew that "www" is an identifier that would indicate a different match. And though the domain might be the same, the host would be different with WWW and I think that's a valid condition. The port can be checked by using a ".port" comparison as well if they both != ''. – Daved Apr 05 '18 at 04:16
  • @tremby Yes, I have just one argument against it, it depends on `a` DOM node. Imagine that I can have only the `href` `string`, I wouldn't be able to test it without creating a ghost node for it. – giovannipds Mar 16 '20 at 15:02
  • @Daved Looking back, maybe testing `.origin` of each would make most sense. That'll include the scheme and port. – tremby Mar 16 '20 at 22:06
  • @giovannipds, you can use `new URL(myHrefString)` instead of a new DOM node. – tremby Mar 16 '20 at 22:07
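The two suggestions in the last comments, comparing `origin` and parsing a bare href string with `new URL`, can be combined in a minimal sketch. The function name `isExternalHref` and the `pageUrl` parameter are hypothetical; in a browser you would pass `window.location.href` as `pageUrl`.

```javascript
// Hypothetical helper: decides whether an href string points off-site.
// pageUrl stands in for window.location.href so relative hrefs can be resolved.
function isExternalHref(href, pageUrl) {
  const target = new URL(href, pageUrl); // resolves relative and protocol-relative hrefs
  const page = new URL(pageUrl);
  // origin = scheme + host + port, so http vs. https and differing ports count as external
  return target.origin !== page.origin;
}

const page = 'https://www.example.com/questions/1';
console.log(isExternalHref('/about', page));                   // false
console.log(isExternalHref('#top', page));                     // false
console.log(isExternalHref('//cdn.example.net/lib.js', page)); // true
console.log(isExternalHref('http://www.example.com/', page));  // true (scheme differs)
```

Because `origin` includes the scheme, this is stricter than comparing `host`: an `http:` link on an `https:` page counts as external, which may or may not be what you want.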
66
var comp = new RegExp(location.host);

$('a').each(function(){
   if(comp.test($(this).attr('href'))){
       // a link that contains the current host           
       $(this).addClass('local');
   }
   else{
       // a link that does not contain the current host
       $(this).addClass('external');
   }
});

Note: this is just a quick & dirty example. It would match all href="#anchor" links as external too. It might be improved by doing some extra RegExp checking.


Update 2016-11-17

This question still gets a lot of traffic, and many people have told me that the accepted solution fails in several cases. As I stated, it was a very quick and dirty answer meant only to show the basic approach to the problem. A more robust solution is to use the properties exposed on an <a> (anchor) element. As @Daved already pointed out in this answer, the key is to compare the link's hostname with the current window.location.hostname. I prefer comparing the hostname properties because they never include the port, whereas the host property includes it whenever it differs from the scheme's default (80 for http, 443 for https).

So here we go:

$( 'a' ).each(function() {
  if( location.hostname === this.hostname || !this.hostname.length ) {
      $(this).addClass('local');
  } else {
      $(this).addClass('external');
  }
});

State of the art:

Array.from( document.querySelectorAll( 'a' ) ).forEach( a => {
    a.classList.add( location.hostname === a.hostname || !a.hostname.length ? 'local' : 'external' );
});
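The difference between `host` and `hostname` that the update relies on can be seen with the standard `URL` object, which exposes the same properties as an anchor element. A small sketch, using made-up example URLs:

```javascript
// host includes the port only when it is not the scheme's default;
// hostname never includes it.
const withPort = new URL('http://example.com:8080/page');
console.log(withPort.host);     // "example.com:8080"
console.log(withPort.hostname); // "example.com"
console.log(withPort.port);     // "8080"

// Default ports (80 for http, 443 for https) are dropped during parsing.
const defaultPort = new URL('https://example.com:443/page');
console.log(defaultPort.host);  // "example.com"
console.log(defaultPort.port);  // ""
```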
jAndy
  • This works fairly well, although I'm going to hold off on an answer in case someone else has a more elegant solution. Out of curiosity, why do anchor links register as external? – Matrym May 26 '10 at 08:05
  • 7
    I'm pretty sure that this will not work with relative urls. `attr` is supposed to return the attribute, not the property (the property might be resolved, not the attribute). – Sean Kinsey May 26 '10 at 10:12
  • 10
    http://jsfiddle.net/zuSeh/ It is verified that this method does not work for relative urls. – Sean Kinsey May 26 '10 at 10:14
  • Just ran into this and want to confirm that this code indeed has a problem with relative urls. So +1 on Sean and Damian. – Grimace of Despair Jul 25 '12 at 10:39
  • 4
    It will work for relative too if you use the href property instead of the href attribute. – Kevin B Jun 06 '14 at 15:29
  • 2
    I've seen people still viewing this question and solution so I wanted to link to my proposed solution below which should handle all checks without RegEx or relative issues: http://stackoverflow.com/a/18660968/2754848 – Daved Oct 14 '14 at 19:59
  • jAndy's solution will also fail (false detection of internal link) in the case that an external URL contains the hostname, e.g. http://external-site.com/domaintools/original.hostname.com/stats or something like that. Overall I favour @Daved's solution as the most elegant and most direct check—far less error prone when the browser has already resolved the URL and you're checking its resolved hostname! – Laogeodritt Nov 16 '16 at 18:50
  • 1
    Updated answer. – jAndy Nov 16 '16 at 23:49
  • @jAndy I believe that, correct me if I'm wrong, there is no need for using `Array.from` in last part of your answer, instead just wrap the expression in parens to become `(document.querySelectorAll( 'a' )).forEach` – Mo Ali May 30 '18 at 20:32
  • 1
    @MoAli At the time I wrote that, most browsers didn't provide the `forEach` method on the `HTMLElement prototype`. I still wouldn't dare to directly call that even today to be honest. Of course there are still some versions and mobile browsers which also don't support `Array.from`, but chances are way higher here, than `.forEach` being on the prototype. – jAndy May 30 '18 at 23:47
  • @jAndy - Thanks for this solution. one more point is that if the href has relative URL having # in it, then it is not working. may you please give a solution for the same? – Rahul J. Rane Sep 19 '19 at 10:03
  • @jAndy, why do you check "!this.hostname.length". In chrome it's always not empty even if href does not have hostname. Can it be empty in some browser? – Vincente Dec 24 '19 at 14:34
  • 1
    @Vincente I just did a quick check here on the SO site... If you filter all results from `document.querySelectorAll( 'a' ).forEach( anchor => {});` for `if( !anchor.hostname.length)` you get lots of results. For instance anchors which only have a `name` or `class` but no `href`. – jAndy Dec 27 '19 at 00:42
  • @jAndy, thank you. I also tested it in IE11, link with href="/page1" does not have hostname property, unlike in Chrome. – Vincente Dec 28 '19 at 12:22
37

And the no-jQuery way

var nodes = document.getElementsByTagName("a"), i = nodes.length;
var regExp = new RegExp("//" + location.host + "($|/)");
while(i--){
    var href = nodes[i].href;
    var isLocal = (href.substring(0,4) === "http") ? regExp.test(href) : true;
    alert(href + " is " + (isLocal ? "local" : "not local"));
}

All hrefs not beginning with http (http://, https://) are automatically treated as local

Sean Kinsey
  • 1
    This answer is more accurate. Relatives URL are also important. – Savageman May 26 '10 at 22:37
  • If I'm not mistaken, "($|/" should actually be "($|/)" with a closing brace – Grimace of Despair Jul 25 '12 at 08:08
  • 2
    This is close to the solution but you should also check if the href property does begin with `location.protocol+'//'+location.host`. Check this fiddle: http://jsfiddle.net/framp/Ag3BT/1/ – framp May 12 '13 at 22:21
  • 2
    Why are you doing this with a while loop? Seems to me like it would make more sense to use event delegation via $(document).on('click', 'a', function({}); and test the specific link that was clicked (at the point of being clicked). That way, you don't needlessly loop through all the links on the page, and it will allow for any elements added to the page via ajax after the initial DOM ready... There's actually a point to using jQuery sometimes (beyond being a "fanboy"). – 1nfiniti May 16 '13 at 22:13
  • 7
    This won't work if you use protocol agnostic urls, ie: `href="//somedomain.com/some-path"` – rossipedia Jan 14 '14 at 21:10
9
var external = RegExp('^((f|ht)tps?:)?//(?!' + location.host + ')');

Usage:

external.test('some url'); // => true or false
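One thing to watch with this pattern: `location.host` is inserted into the regular expression unescaped, so its dots match any character, and a host that merely starts with the current one passes as internal. A hedged sketch that escapes the host and requires a boundary after it; `escapeRegExp` and `makeExternalTest` are hypothetical helper names, and the host is passed in as a parameter where the answer reads `location.host`:

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a tester for "absolute or protocol-relative URL on a different host".
function makeExternalTest(host) {
  const pattern =
    '^(?:(?:f|ht)tps?:)?//(?!' + escapeRegExp(host) + '(?:[/?#]|$))';
  return new RegExp(pattern, 'i');
}

const external = makeExternalTest('www.example.com');
console.log(external.test('https://www.example.com/page'));       // false
console.log(external.test('//www.example.com'));                  // false
console.log(external.test('https://wwwXexample.com/'));           // true
console.log(external.test('https://www.example.com.evil.test/')); // true
console.log(external.test('/relative/path'));                     // false
```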
James
7

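Here's a jQuery selector for links with an absolute or protocol-relative URL, which are usually the external ones (attribute selectors do not accept regular expressions, so each prefix is listed separately):

$('a[href^="http://"], a[href^="https://"], a[href^="//"]')

A jQuery selector only for internal links (not including hash links within the same page) needs to be a bit more complicated:

$('a:not([href^="http://"],[href^="https://"],[href^="//"],[href^="#"],[href^="mailto:"])')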

Additional filters can be placed inside the :not() condition and separated by additional commas as needed.

http://jsfiddle.net/mblase75/Pavg2/


Alternatively, we can filter internal links using the vanilla JavaScript href property, which is always an absolute URL:

$('a').filter( function(i,el) {
    return el.href.indexOf(location.protocol+'//'+location.hostname)===0;
})

http://jsfiddle.net/mblase75/7z6EV/

Blazemonger
  • +1: OK there was a way to up-vote you again for this answer :) – iCollect.it Ltd Jun 06 '14 at 15:43
  • First one doesn't work as "//code.jquery.com/jquery.min.js" is a completely legit URL, but not internal. The protocol and colon are not required, as the browser will use whatever the current site is using (a semi-sorta protocol relative URL). – mikesir87 Oct 21 '14 at 19:04
  • FYI, your external selector gives an error: `Uncaught Error: Syntax error, unrecognized expression: a[href^="(http:|https:)?//"])` – kthornbloom May 05 '16 at 16:05
6

You forgot one case: what if you use a relative path?

For example: /test

var hostname = new RegExp(location.host);

// Act on each link
$('a').each(function(){

    // Store current link's url
    var url = $(this).attr("href");

    // Skip anchors that have no href attribute
    if(url === undefined){
        return;
    }

    // Test if current host (domain) is in it
    if(hostname.test(url)){
        // If it's local...
        $(this).addClass('local');
    }
    else if(url.slice(0, 1) == "/" && url.slice(0, 2) != "//"){
        // Root-relative path (but not a protocol-relative URL)
        $(this).addClass('local');
    }
    else if(url.slice(0, 1) == "#"){
        // It's an anchor link
        $(this).addClass('anchor');
    }
    else {
        // a link that does not contain the current host
        $(this).addClass('external');
    }
});

There is also the issue of file downloads such as .zip (local and external), which could use the classes "local download" or "external download", but I haven't found a solution for that yet.

  • Not all relative URLs start with `/`. You can reference something like `images/logo.png` which is one folder down from your current location. In that case you're referencing a relative path in your relative URL, it will be a different meaning in different directories on your site. `/images/logo.png` is an absolute path of whatever site it's running on (hence the relativity). Your code will not include relative paths like `images/logo.png`. – Adam Plocher May 08 '14 at 21:54
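Both the relative-path case from the comment and the download classes mentioned in the answer can be handled by letting the `URL` constructor resolve the href first. A minimal sketch under stated assumptions: `classifyHref` is a hypothetical helper, `pageUrl` stands in for `window.location.href`, and "download" is assumed to mean a path ending in `.zip`.

```javascript
// Hypothetical classifier; pageUrl stands in for window.location.href.
function classifyHref(href, pageUrl) {
  if (href.charAt(0) === '#') {
    return 'anchor'; // fragment-only link within the current page
  }
  // Resolves "images/logo.png", "/test", "//host/x" and absolute URLs alike
  const target = new URL(href, pageUrl);
  const page = new URL(pageUrl);
  const scope = target.hostname === page.hostname ? 'local' : 'external';
  // Assumed rule for downloads: the path ends in ".zip"
  const isDownload = /\.zip$/i.test(target.pathname);
  return isDownload ? scope + ' download' : scope;
}

const page = 'https://www.example.com/docs/index.html';
console.log(classifyHref('#section', page));                        // "anchor"
console.log(classifyHref('images/logo.png', page));                 // "local"
console.log(classifyHref('/files/archive.zip', page));              // "local download"
console.log(classifyHref('https://other.example.org/a.zip', page)); // "external download"
```

The returned string can be passed straight to `addClass`, since jQuery accepts several space-separated class names.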
6
const isExternalLink = (url) => {
    const tmp = document.createElement('a');
    tmp.href = url;
    return tmp.host !== window.location.host;
};

// output: true
console.log(isExternalLink('https://foobar.com'));
console.log(isExternalLink('//foobar.com'));

// output: false
console.log(isExternalLink('https://www.stackoverflow.com'));
console.log(isExternalLink('//www.stackoverflow.com'));
console.log(isExternalLink('/foobar'));
console.log(isExternalLink('#foobar'));

The benefit of using this approach is that:

  • It would automatically resolve the hostname for relative paths and fragments;
  • It works with protocol-relative URLs

To demonstrate this, let's look at the following examples:

const lnk = document.createElement('a');

lnk.href = '/foobar';
console.log(lnk.host); // output: 'www.stackoverflow.com'

lnk.href = '#foobar';
console.log(lnk.host); // output: 'www.stackoverflow.com'

lnk.href = '//www.stackoverflow.com';
console.log(lnk.host); // output: 'www.stackoverflow.com'
Saeed Shahbazi
4

With jQuery

jQuery('a').each(function() {
    if (this.host !== window.location.host) {
        console.log(jQuery(this).attr('href'));
    }
});
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
  • 2
    A good answer will always include an explanation why this would solve the issue, so that the OP and any future readers can learn from it. – Tyler2P Dec 26 '21 at 11:28
3

You can use is-url-external module.

var isExternal = require('is-url-external');
isExternal('http://stackoverflow.com/questions/2910946'); // true | false 
mrded
2

    // All links in the DOM
    var links = document.querySelectorAll('a');

    // Home page URL, hard-coded here; the current page URL is available via window.location.href
    var HomeUrl = 'https://stackoverflow.com/';

    links.forEach(function(link) {
        link.addEventListener('click', function(e) {
            e.preventDefault();

            // Make lowercase of urls
            var url = link.href.toLowerCase();
            var isExternalLink = !url.includes(HomeUrl);

            // Check if external or internal
            if (isExternalLink) {
                if (confirm('it\'s an external link. Are you sure to go?')) {
                    window.location = link.href;
                }
            } else {
                window.location = link.href;
            }
        })
    })
<a href="https://stackoverflow.com/users/3705299/king-rayhan">Internal Link</a>
<a href="https://wordpress.stackexchange.com/">External Link</a>
King Rayhan
2

This should work for any kind of link in every browser except IE.

// check if link points outside of app - not working in IE
try {
    const href = $linkElement.attr('href'),
        link = new URL(href, window.location.href);

    if (window.location.host === link.host) {
        // same app
    } else {
        // points outside
    }
} catch (e) {
    // in case IE happens
}
fuuchi
  • Note to `new URL(href, window.location)`: Argument of type 'Location' is not assignable to parameter of type 'string | URL | undefined'. – Cezary Tomczyk Mar 27 '21 at 09:36
0

Yes, I believe you can retrieve the current domain name with location.href. Another possibility is to create an anchor element, set its href to /, and then retrieve the resolved URL (this will give you the base URL if you use one, and not necessarily the domain name).

Also see this post: Get the full URI from the href property of a link

Community
Savageman
0

For those interested, I did a ternary version of the if block with a check to see what classes the element has and what class gets attached.

$(document).ready(function () {
    $("a").click(function (e) {

        var hostname = new RegExp(location.host);
        var url = $(this).attr("href");

        hostname.test(url) ?
        $(this).addClass('local') :
        url.slice(0, 1) == "/" && url.slice(-1) == "/" ?
        $(this).addClass('localpage') :
        url.slice(0, 1) == "#" ?
        $(this).addClass('anchor') :
        $(this).addClass('external');

        var classes = $(this).attr("class");

        console.log("Link classes: " + classes);

        $(this).hasClass("external") ? googleAnalytics(url) :
        $(this).hasClass("anchor") ? console.log("Handle anchor") : console.log("Handle local");

    });
});

The google analytics bit can be ignored but this is where you'd probably like to do something with the url now that you know what type of link it is. Just add code inside the ternary block. If you only want to check 1 type of link then replace the ternaries with an if statement instead.

Edited to add in an issue I came across. Some of my hrefs were "/Courses/" like so. I did a check for a localpage which checks if there is a slash at the start and end of the href. Although just checking for a '/' at the start is probably sufficient.

0

I use this function for jQuery:

$.fn.isExternal = function() {
  // Compare hostname with hostname; location.host would also carry a non-default port
  var host = window.location.hostname;
  var link = $('<a>', {
    href: this.attr('href')
  })[0].hostname;
  return (link !== host);
};

Usage is: $('a').isExternal();

Example: https://codepen.io/allurewebsolutions/pen/ygJPgV

0

This doesn't exactly meet the "cannot hardcode my domain" prerequisite of the question, but I found this post searching for a similar solution, and in my case I could hard code my url. My concern was alerting users that they are leaving the site, but not if they are staying on site, including subdomains (example: blog.mysite.com, which would fail in most of these other answers). So here is my solution, which takes some bits from the top voted answers above:

Array.from( document.querySelectorAll( 'a' ) ).forEach( a => {
  a.classList.add( a.hostname.includes("mywebsite.com") ? 'local' : 'external' );
});

$("a").on("click", function(event) {
  if ($(this).hasClass('local')) {
    return;
  } else if ($(this).hasClass('external')) {
    if (!confirm("You are about to leave the <My Website> website.")) {
      event.preventDefault();
    }
  }
});
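One caveat with `includes` is that it also matches hosts such as `notmywebsite.com` or `mywebsite.com.evil.test`. A stricter check, sketched here with the hypothetical helper `isSameSite`, accepts only the exact domain or a hostname ending in a dot followed by the domain:

```javascript
// Hypothetical helper: true when hostname is the site domain or one of its subdomains.
function isSameSite(hostname, siteDomain) {
  const host = hostname.toLowerCase();
  const site = siteDomain.toLowerCase();
  return host === site || host.endsWith('.' + site);
}

console.log(isSameSite('mywebsite.com', 'mywebsite.com'));           // true
console.log(isSameSite('blog.mywebsite.com', 'mywebsite.com'));      // true
console.log(isSameSite('notmywebsite.com', 'mywebsite.com'));        // false
console.log(isSameSite('mywebsite.com.evil.test', 'mywebsite.com')); // false
```

In the snippet above, `isSameSite(a.hostname, "mywebsite.com")` could take the place of the `includes` call.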
Jacob Hixon
0

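This works for me:

function strip_scheme( url ) {
  // Remove an optional scheme and "//", then an optional leading "www."
  return url.replace(/^(?:(?:(?:f|ht)tps?:)?\/\/)?(?:www\.)?/, '');
}

function is_link_external( elem ) {
  // elem is a jQuery object; the href property (unlike the attribute) is already resolved to an absolute URL
  let domain = strip_scheme( elem.prop('href') );
  let host = strip_scheme( window.location.host );

  // External unless the stripped href starts with the stripped host
  return domain.indexOf(host) !== 0;
}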
erom
0

As of 2023, this does the job for me:

export function isInternalLink(urlString: string): boolean | {
  pathname: string;
} {
  // Resolve against the current page; without a base, new URL() throws on relative URLs
  const url = new URL(urlString, window.location.href);
  if (url.origin !== window.location.origin) {
    return false;
  }
  return {
    pathname: url.pathname,
  };
}

Phạm Huy Phát