0

I have an API that returns a domain to my front end.

The domain is in string format.

For example: "google.com" / "google.co.ok"
or "test.google.com" / "test.google.co.ok"

Notice that the string does not contain any protocol.

I want to write a method that parses the string and returns true if the string contains a subdomain.

In the two examples above, the method should return true for test.google.com and test.google.co.ok.

EDIT: If it were Python, I would write something like the code below, but I am hoping something similar is available in JS.

from tld import get_tld, get_fld

get_tld("www.google.co.uk", fix_protocol=True)
# 'co.uk'

get_fld("www.google.co.uk", fix_protocol=True)
# 'google.co.uk'
newbie
  • 1
    Since there's no protocol, maybe something like `"word.domain.co.uk/something".split("/")[0].split(".").length > 2` – Daniel Szabo Jan 08 '22 at 23:05
  • @Dexygen See my edit. I know how to get answer if I were to use Python – newbie Jan 08 '22 at 23:17
  • Have a look at [Get the domain name of the subdomain Javascript](https://stackoverflow.com/q/13367376/1048572) then. Is that enough to answer your question? – Bergi Jan 08 '22 at 23:18
  • @DanielSzabo's solution seems to work. Can you add that as an answer, please? – newbie Jan 08 '22 at 23:47
  • 2
    You didn't write a Python method, you used a method from a library – Dexygen Jan 09 '22 at 00:07
  • I think you need to define what a "subdomain" is (or is not), more rigorously. [rfc1034 section 3.1](https://datatracker.ietf.org/doc/html/rfc1034#section-3.1) says _A domain is a subdomain of another domain if it is contained within that domain._ This suggests that even "com" is a subdomain of the root domain " ". – Wyck Jan 09 '22 at 00:44
  • Did you look into the implementation of [the python library you were using](https://pypi.org/project/tld/)? – Bergi Jan 09 '22 at 00:53
  • @Wyck Yes, `example.com` technically is only a relative domain, which usually refers to the unambiguous `example.com.` domain. – Bergi Jan 09 '22 at 00:56
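
The RFC 1034 definition quoted in the comments above can be sketched as a label-by-label suffix check. This is only an illustration: `toLabels` and `isSubdomainOf` are hypothetical helper names, and treating a domain as not being a subdomain of itself is an assumption rather than something the RFC text quoted here settles.

```javascript
// Split a domain name into lower-case labels, ignoring a trailing root dot.
// (Hypothetical helper, not part of any library.)
function toLabels(name) {
  const trimmed = name.trim().toLowerCase().replace(/\.$/, "");
  return trimmed === "" ? [] : trimmed.split(".");
}

// True if `child` is contained within `parent`.
// Assumption: a domain is not counted as a subdomain of itself.
function isSubdomainOf(child, parent) {
  const c = toLabels(child);
  const p = toLabels(parent);
  if (c.length <= p.length) return false;
  // Compare labels from the right: every parent label must match.
  return p.every((label, i) => label === c[c.length - p.length + i]);
}

console.log(isSubdomainOf("test.google.com", "google.com")); // true
console.log(isSubdomainOf("google.com", "google.com"));      // false
console.log(isSubdomainOf("com", ""));                       // true (root domain)
```

Under this reading every name is a subdomain of something, so the question really needs a reference point such as the registrable domain.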

3 Answers

1

There are multiple JavaScript libraries available that can be used the same way you're using tld. psl is older but still has millions of weekly downloads.

You could use psl and implement something like this:

import { parse } from "psl";

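function hasSubdomain(str) {
  // parse() sets `subdomain` to null when the name has none.
  // For invalid input it returns an error object without a `subdomain`
  // property, so a missing value is treated as "no subdomain" too.
  const { subdomain } = parse(str);

  return subdomain != null;
}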

hasSubdomain("www.google.com") // true
hasSubdomain("google.co.uk") // false

Feel free to clone and edit this example on RunKit as you see fit.

Ezra
0

Sure thing. Since there's no protocol, maybe something like:

"word.domain.com"
  .split(".").length > 2 // true

"domain.com"
  .split(".").length > 2 // false

"www.domain.co.uk"
  .split(".").length > 2 // true, but "www" and the two-part suffix "co.uk" inflate the count

You'll likely need to parse out "www" and second-level domains (".co", ".gc", etc).
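
A rough sketch of that idea, assuming a leading "www" should not count as a subdomain. `MULTI_PART_SUFFIXES` is a hypothetical, hand-picked sample used only for illustration; a real implementation would need the full public suffix list rather than three entries.

```javascript
// Hypothetical sample of multi-part suffixes, for illustration only.
// A real implementation would load the full public suffix list instead.
const MULTI_PART_SUFFIXES = new Set(["co.uk", "com.au", "gc.ca"]);

function hasSubdomain(host) {
  // Normalise: trim, lower-case, drop a trailing root dot, split into labels.
  const labels = host.trim().toLowerCase().replace(/\.$/, "").split(".");

  // Assumption: a leading "www" label is ignored.
  if (labels[0] === "www") labels.shift();

  // The suffix spans two labels if it is in the sample set, otherwise one.
  const lastTwo = labels.slice(-2).join(".");
  const suffixLength = MULTI_PART_SUFFIXES.has(lastTwo) ? 2 : 1;

  // Anything beyond the suffix plus one registrable label is a subdomain.
  return labels.length > suffixLength + 1;
}

console.log(hasSubdomain("word.domain.com"));   // true
console.log(hasSubdomain("domain.com"));        // false
console.log(hasSubdomain("www.domain.co.uk"));  // false
console.log(hasSubdomain("test.domain.co.uk")); // true
```

Any suffix missing from the set is treated as a single label, which is why this only works reliably with the complete list.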

Daniel Szabo
  • For those "multi-part top-level domains" (I'd rather call them "effective" tlds), see the [public suffix list](https://publicsuffix.org) – Bergi Jan 09 '22 at 00:47
  • hmmm..parsing out www makes sense..but keeping track of second-level domain could be challenging – newbie Jan 09 '22 at 00:53
-2

You can use RegExp to perform string manipulation. Please take a look at the following snippet and run the code and see the results from different test cases covering most of the possibilities. Let me know if it's helpful.

function subDomain(url) {
  // REMOVE LEADING AND TRAILING WHITE SPACE
  url = url.trim();

  // CONVERT BACK SLASHES TO FORWARD SLASHES
  url = url.replace(/\\/g, "/");

  // REMOVES 'www.' FROM THE START OF THE STRING
  url = url.replace(/^www\./i, "");

  // REMOVE STRING FROM FIRST FORWARD SLASH ON
  url = url.replace(/\/.*/, "");

  // REMOVES '.??.??' OR '.???.??' FROM END - e.g. '.CO.UK', '.COM.AU'
  if (/\.[a-z]{2,3}\.[a-z]{2}$/i.test(url)) {
    url = url.replace(/\.[a-z]{2,3}\.[a-z]{2}$/i, "");

    // REMOVES '.??' or '.???' or '.????' FROM END - e.g. '.US', '.COM', '.INFO'
  } else if (/\.[a-z]{2,4}$/i.test(url)) {
    url = url.replace(/\.[a-z]{2,4}$/i, "");
  }

  // CHECK TO SEE IF THERE IS A DOT '.' LEFT
  return url.includes(".");
}

const subdomainInput = "test.google.com";
const subdomainInputWithPath = "test.google.com/test";
const subdomainInputWithPathWithWS = "    test.google.com/test    ";
const subdomainInputWithWS = "   test.google.com    ";
const subdomainInputWithQueryString = "test.google.com/test?token=33333";
const noSubInput = "google.com";
const noSubInputWithPath = "google.com/search";
const noSubInputWithPathWithQueryString = "google.com/search?token=ttttttt";
console.log("Test Run\n");
console.log("With subdomain test cases");
console.log(`subdomainInput: ${subDomain(subdomainInput)}`);
console.log(`subdomainInputWithPath: ${subDomain(subdomainInputWithPath)}`);
console.log(`subdomainInputWithWS: ${subDomain(subdomainInputWithWS)}`);
console.log(`subdomainInputWithPathWithWS: ${subDomain(subdomainInputWithPathWithWS)}`);
console.log(`subdomainInputWithQueryString: ${subDomain(subdomainInputWithQueryString)}`);

console.log("Without subdomain test cases");
console.log(`noSubInput: ${subDomain(noSubInput)}`);
console.log(`noSubInputWithPath: ${subDomain(noSubInputWithPath)}`);
console.log(`noSubInputWithPathWithQueryString: ${subDomain(noSubInputWithPathWithQueryString)}`);
