-3

I want a mechanism to extract the sub domain from location.hostname that covers all of the scenarios below.

 1. example.com => return value is blank since no sub domain
 2. www.example.com => return value is blank since no sub domain
 3. test.example.com => return value should be test since this is the sub domain
 4. example.co.in => return value is blank since no sub domain
 5. www.example.co.in => return value is blank since no sub domain
 6. test.example.co.in => return value should be test since this is the sub domain
 7. 183.87.46.82 => return value is blank since IP passed

I only need to handle the scenarios given above; I do not expect anything more than this. Most importantly, I do not need to extract nested sub domain names; the first-level sub domain name is more than enough.

Any idea in this regard would be helpful.

user850234
  • the following answer may help you - http://stackoverflow.com/a/23945027/2020893 – Karthik AMR Mar 08 '16 at 08:22
  • And this too - https://css-tricks.com/snippets/javascript/get-url-and-url-parts-in-javascript/ – Karthik AMR Mar 08 '16 at 08:25
  • Voting to close as SO is not a code writing service. – RobG Mar 08 '16 at 08:28
  • Then you need to know all the SLDs (second-level domains) like co.uk and be able to recognise what is an IP address and what is not. The only lib I know that does this well is [URI.js](https://medialize.github.io/URI.js/) – Endless Mar 08 '16 at 08:36

3 Answers

1

Consider the following articles for defining valid hostnames:
https://www.rfc-editor.org/rfc/rfc952
https://www.rfc-editor.org/rfc/rfc1123
This regex should help you in your case:

var regex = /^(?!www\.|\d{1,3}\.)[a-z0-9-]+?\.[a-z0-9-]{3,}\.[a-z0-9-]+?(\.[a-z0-9-]+?)*?$/gi;

var hostname = "example.com";
console.log(hostname.match(regex));   // null

hostname = "www.example.com";
console.log(hostname.match(regex));   // null

hostname = "test.example.com";
console.log(hostname.match(regex));   // [ "test.example.com" ]

hostname = "example.co.in";
console.log(hostname.match(regex));   // null

hostname = "www.example.co.in";
console.log(hostname.match(regex));   // null

hostname = "1test.example.co.in";
console.log(hostname.match(regex));   // [ "1test.example.co.in" ]

hostname = "187.162.10.12";
console.log(hostname.match(regex));   // null

https://jsfiddle.net/fknhumw3/
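
The regex above returns the whole hostname when it matches, while the question asks for the sub domain label itself. A minimal sketch of one way to get that, assuming a capture group is added around the first label (the helper name `subdomainByRegex` is hypothetical, and the pattern is a simplified variant of the one above, not a replacement for a proper suffix list):

```javascript
// Hypothetical helper: returns the first label for hostnames shaped like
// label.domain.tld or label.domain.xx.yy, otherwise ''.
function subdomainByRegex(hostname) {
    // Group 1 captures the leading label. The lookahead rejects "www."
    // and a leading run of 1-3 digits followed by a dot (IP-like input).
    var pattern = /^(?!www\.|\d{1,3}\.)([a-z0-9-]+)\.[a-z0-9-]{3,}(?:\.[a-z0-9-]+)+$/i;
    var match = pattern.exec(hostname);
    return match ? match[1] : '';
}

console.log(subdomainByRegex('test.example.co.in'));
console.log(subdomainByRegex('www.example.com'));
```

Like the original pattern, this relies on the main domain name being at least three characters long, so it only covers the scenarios listed in the question.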

RomanPerekhrest
0

Try this:

  ["example.com",
   "www.example.com",
   "test.example.com",
   "http://example.co.in",
   "http://www.example.co.in",
   "test.example.co.in",
   "http://183.87.46.82"]
        .filter(function(url){
            return url.match(/^(?!www).*\.(.*)\.co.*$/g)
        })

Updated regex:

^(?!www).*\.(.*)\.co.*$
Zamboney
0

I personally do consider www to be a subdomain, and in the case of 'second-level' domains (.co.uk) I would in fact consider co the domain name, with whatever comes before it being a subdomain.

Since that doesn't really answer your question, here's an approach based solely on your input (which you will end up modifying once you find out that 'second-level' domains (that list does not cover everything) are a lot harder to detect than you think).

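function subdomain(host) {
    // Labels in reverse order, e.g. ['in', 'co', 'example', 'test']
    var part = host.split('.').reverse(),
        index = 0;

    // Skip the TLD (index 0) and any following 2-character labels
    // (the blunt 'co.in' / 'co.uk' rule); stop at the end of the array
    while (index < part.length && (part[index].length === 2 || !index)) {
        ++index;
    }
    // Skip the main domain name
    ++index;

    return part.length > index && part[index] !== 'www' ? part[index] : '';
}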

Working example

What this does is apply a very blunt rule: 'second-level' domains always consist of 2x2 chars (co.uk, co.in, etc.). It skips those, then skips what is now assumed to be the main domain name. If there is finally something at the index we have determined and it is not 'www', you get it back.

This is merely an example to show you how difficult your question is: a robust solution would require an up-to-date (as in actively maintained, curated) list of 'second-level' domains, or else it may fail.
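
To illustrate what a list-based approach could look like, here is a minimal sketch. The three entries in `KNOWN_SUFFIXES` are placeholders chosen for illustration only (a real list would be far longer and would need maintaining), and the function name `subdomainWithSuffixList` is hypothetical:

```javascript
// Illustrative only: a real list of multi-label suffixes is far longer.
var KNOWN_SUFFIXES = ['co.in', 'co.uk', 'com.au'];

function subdomainWithSuffixList(host) {
    var name = host.toLowerCase();
    // Dotted-quad IPv4 addresses have no sub domain.
    if (/^\d{1,3}(\.\d{1,3}){3}$/.test(name)) return '';
    // Default assumption: the suffix is a single label such as "com".
    var suffixLabels = 1;
    for (var i = 0; i < KNOWN_SUFFIXES.length; i++) {
        var suffix = KNOWN_SUFFIXES[i];
        if (name === suffix || name.slice(-(suffix.length + 1)) === '.' + suffix) {
            suffixLabels = suffix.split('.').length;
            break;
        }
    }
    var labels = name.split('.');
    // Index of the label directly to the left of the main domain name.
    var subIndex = labels.length - suffixLabels - 2;
    if (subIndex < 0) return '';
    return labels[subIndex] === 'www' ? '' : labels[subIndex];
}

console.log(subdomainWithSuffixList('test.example.co.in'));
console.log(subdomainWithSuffixList('example.co.in'));
```

Any suffix missing from the list falls back to the single-label assumption, which is exactly the failure mode described above.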

The only thing I actually did take into consideration is that some.deep.nested.sub.domain.com will give you sub instead of some.

(Also note that I did not actively prevent the IP from matching at all; it just so happens to match the 2x2 rule.)
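
Because that coincidence only holds when the octets happen to be two digits long (with the function as written, an address such as 10.0.0.1 would return '0'), an explicit check before splitting may be safer. A minimal sketch, assuming only dotted-quad IPv4 input needs to be recognised; the name `isIPv4` is hypothetical:

```javascript
// Hypothetical guard: true only for four dot-separated decimal octets (0-255).
function isIPv4(host) {
    var parts = host.split('.');
    if (parts.length !== 4) return false;
    return parts.every(function (part) {
        return /^\d{1,3}$/.test(part) && Number(part) <= 255;
    });
}

console.log(isIPv4('183.87.46.82'));
console.log(isIPv4('test.example.com'));
```

A caller could return '' early whenever `isIPv4(host)` is true and only then apply the label-based logic.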


I'm very curious about the problem you are trying to solve by isolating the subdomain, as in itself it does not have any meaning. I can think of situations where you'd like to display a 'nickname' of sorts based on a subdomain, but then I reckon you'd know the patterns to expect. From a technical point of view, having only the subdomain would be useless.

Rogier Spieker
  • Thanks...this is what I was looking for – user850234 Mar 08 '16 at 09:08
  • @user850234 please do consider all of the concerns I raised in the answer. They may not apply now, but technology has a nasty habit of eventually catching you off guard. – Rogier Spieker Mar 08 '16 at 09:11
  • Yes...I took all those into consideration. The situation I am trying to solve is this: my application has multiple sub domain logins, and the UI has to extract the sub domain from the URL and use it for subsequent communication with the APIs. In case of an invalid sub domain, it should display a user-friendly message and load the relevant screen. – user850234 Mar 08 '16 at 09:45