5

Given a string with URLs in the following formats:

https://www.cnn.com/
http://www.cnn.com/
http://www.cnn.com/2012/02/16/world/american-nicaragua-prison/index.html
http://edition.cnn.com/?hpt=ed_Intl

With JS/jQuery, how can I extract just cnn.com from the string for all of them? That is, the top-level domain name plus its extension?

Thanks

AnApprentice

5 Answers

3
var loc = document.createElement('a');

loc.href = 'http://www.cnn.com/2012/02/16/world/index.html';

window.alert(loc.hostname); // alerts "www.cnn.com"

Credits for the previous method:

Creating a new Location object in javascript

Community
Alex
  • hence: `loc.hostname.split('.').slice(-2).join('.')` but you're still gonna get screwed on the whole `.co.uk` type of domain... – tkone Feb 20 '12 at 13:11
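The comment's suggestion can be sketched for an arbitrary URL string rather than an anchor element. This is only a minimal sketch: it swaps in the global `URL` constructor (available in modern browsers and Node.js) for the `document.createElement('a')` trick, and `naiveDomain` is a hypothetical helper name, not something from the answer.

```javascript
// Sketch: parse a URL string and keep only the last two hostname labels.
// Assumes the global URL constructor (modern browsers, Node.js).
// naiveDomain is a hypothetical helper name used for illustration.
function naiveDomain(urlString) {
  var hostname = new URL(urlString).hostname; // e.g. "www.cnn.com"
  return hostname.split('.').slice(-2).join('.');
}

// The four URLs from the question.
var samples = [
  'https://www.cnn.com/',
  'http://www.cnn.com/',
  'http://www.cnn.com/2012/02/16/world/american-nicaragua-prison/index.html',
  'http://edition.cnn.com/?hpt=ed_Intl'
];

var results = samples.map(naiveDomain); // every entry is "cnn.com"
```

As the comment warns, this still breaks on multi-part suffixes: `naiveDomain('http://www.bbc.co.uk/')` gives `"co.uk"`, not `"bbc.co.uk"`.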
1
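// Returns the full hostname (e.g. "www.cnn.com"), not just "cnn.com".
function domain(input){
    var matches,
        output = "",
        // Scheme, "://", then the hostname: word characters, dots and hyphens.
        // (A "|" inside a character class is a literal pipe, so it is dropped.)
        urls = /\w+:\/\/([\w.-]+)/;

    matches = urls.exec(input);

    if(matches !== null){
        output = matches[1];
    }

    return output;
}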
chrishawn
0
var domain = location.host.split('.').slice(-2);

If you want it reassembled:

var domain = location.host.split('.').slice(-2).join('.');

But this won't work with co.uk or similar suffixes. There's no hard and fast rule for this; not even a regex can determine it.

tkone
  • ... where `location.host` is the subject string – satoshi Feb 20 '12 at 00:26
  • @satoshi `location.host` is the string property of the location object. It refers to the data between the '//' and the first '/' in your URL. It's in browsers back to IE6 and I believe even Netscape 4. – tkone Feb 20 '12 at 00:29
  • This doesn't work at all. `"http://www.cnn.com/2012/02/16/world/american-nicaragua-prison/index.html".split('.').slice(-2).join('.')` -> `"com/2012/02/16/world/american-nicaragua-prison/index.html"` – Dagg Nabbit Feb 20 '12 at 00:31
  • @ggg location is a browser object. Location.host is the server URL. Try that in the chrome debugger or firebug, etc – tkone Feb 20 '12 at 00:32
  • What I'm saying is that the question author didn't ask to have this regex applied to the current page... – satoshi Feb 20 '12 at 00:46
  • ah, you're right, my mistake! I somehow read that as `location.href`. By the way, this doesn't work with TLDs with dots in them (`co.uk` for example). @satoshi it could be used in conjunction with a trick like Xander proposes below (if it worked with dotted TLDs). – Dagg Nabbit Feb 20 '12 at 00:48
0

Given that there are top-level domains with dots in them, for example "co.uk", there's no way to do this programmatically unless you include a list of all of the TLDs with dots in them.
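A rough sketch of that list-based approach follows. The suffix list here is a hypothetical, deliberately tiny sample, and `registrableDomain` is an illustrative name; a real implementation would need a much longer, maintained list (the Public Suffix List is one such source). It also assumes the global `URL` constructor is available.

```javascript
// Hypothetical, deliberately tiny sample of dotted suffixes.
// A real list would be far longer and would need to be kept up to date.
var MULTI_PART_SUFFIXES = ['co.uk', 'org.uk', 'com.au', 'co.jp'];

// Sketch: keep two hostname labels by default, or three when the
// last two labels form a known dotted suffix such as "co.uk".
function registrableDomain(urlString, suffixes) {
  var labels = new URL(urlString).hostname.split('.');
  var lastTwo = labels.slice(-2).join('.');
  var keep = suffixes.indexOf(lastTwo) !== -1 ? 3 : 2;
  return labels.slice(-keep).join('.');
}

var a = registrableDomain('http://edition.cnn.com/?hpt=ed_Intl', MULTI_PART_SUFFIXES);
var b = registrableDomain('http://www.bbc.co.uk/news', MULTI_PART_SUFFIXES);
// a is "cnn.com", b is "bbc.co.uk"
```

Any suffix missing from the list falls back to the two-label behaviour, so the result is only as good as the list.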

Dagg Nabbit
-1
// something.domain.com -> domain.com
function getDomain() {
  return window.location.hostname.replace(/([a-z]+.)/,"");
}
Gubatron
  • About your regular expression: A. you don't need a capture group here. B. you're trying to match a literal `.` with an unescaped `.`, which matches any other character too (except `\n`). So this should be: `window.location.hostname.replace(/[a-z]+\./,"")` – Christiaan Westerbeek May 23 '14 at 12:25