
I am looking for a way to retrieve the main (naked) domain from an arbitrary domain or subdomain. I am not looking for a sed or awk command (the input is arbitrary) but for a dig, host or nslookup invocation that actually shows the naked domain. Any suggestions?

Example:

from www.bbc.co.uk -> bbc.co.uk
from www.google.com -> google.com
from subdomain.google.co.uk -> google.co.uk
from subdomain.ofasubdomain.google.com.au -> google.com.au 


Porcac1x
  • If you use a network tool like dig or nslookup, you are not guaranteed consistent results: in some cases www.domain.com is a CNAME of domain.com, in others they are both A records. It could also be that domain.com is a CNAME record pointing to www.domain.com – Francesco Gasparetto Nov 06 '19 at 08:49
  • This is the reason why I need a network tool instead of sed or awk: the www. may be missing, or the domain may be something like bla.bla.bla.bbc.co.uk. I need a way to retrieve the naked domain only – Porcac1x Nov 06 '19 at 08:54
  • I understand, but still "naked domain" is a variable concept, not exactly connected with a DNS entity. Which problem do you have if you take the last 3 parts of the string, with the dot (.) as separator? Which problems would you have with that solution? – Francesco Gasparetto Nov 06 '19 at 09:01
  • The domain can be just google.com, or www.google.com, and so on, so the number of dots can vary – Porcac1x Nov 06 '19 at 09:15
  • @Porcac1x, the top domain on the internet is `.`, so please clarify what you mean by "main domain". It differs a lot per country! It can be `bbc.co.uk`, but also `bbc.com` or `bbc.de` – Romeo Ninov Nov 06 '19 at 09:44
  • @RomeoNinov the ones you listed are different domains. I am referring to the top naked domain, the one you buy from a registrar: www.google.com -> google.com, subdomain.google.co.uk -> google.co.uk, subdomain.ofasubdomain.google.com.au -> google.com.au – Porcac1x Nov 06 '19 at 10:16
  • @Porcac1x, to me this is an XY problem. Please clarify the bigger problem you are trying to solve. – Romeo Ninov Nov 06 '19 at 10:20
  • That's it, I just need to print the main domain of the one provided by the user – Porcac1x Nov 06 '19 at 10:23
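
To illustrate the point raised in the comments about taking a fixed number of trailing labels, here is a minimal sketch. The helper name `last_labels` is hypothetical and not from the thread. It keeps the last N dot-separated labels of a name, and shows that no single N works for both google.com and bbc.co.uk.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the last N dot-separated labels of a name.
last_labels() {
    local name="$1" n="$2" out="" label
    while [ "$n" -gt 0 ] && [ -n "$name" ]; do
        label=${name##*.}              # rightmost label
        out=$label${out:+.$out}        # prepend it to the result
        case $name in
            *.*) name=${name%.*} ;;    # drop the label and continue
            *)   name= ;;              # no labels left
        esac
        n=$((n - 1))
    done
    printf '%s\n' "$out"
}

# N=3 suits www.bbc.co.uk but keeps the www on www.google.com;
# N=2 suits www.google.com but cuts www.bbc.co.uk down to co.uk.
for d in www.bbc.co.uk www.google.com; do
    echo "$d: N=3 -> $(last_labels "$d" 3), N=2 -> $(last_labels "$d" 2)"
done
```

The right number of labels depends on the suffix the domain is registered under, which is information the string alone does not carry.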

1 Answer


I'm not an expert on domain names. Based on https://en.wikipedia.org/wiki/List_of_Internet_top-level_domains, with minor exceptions, domains with a 2-letter (country code) suffix have a main domain of the form something.bb.cc, and for all other suffixes (usually 3 letters) the main domain is something.ccc

Using bash

domain=...
md=
# 2-letter country codes: assume the form something.bb.cc
p2='^(.*\.)?([^.]+\.[a-z]+\.[a-z][a-z])$'
# 3-letter legacy domains: something.ccc
p3='^(.*\.)?([^.]+\.(com|org|net|int|edu|gov|mil))$'
# All other suffixes: keep the last two labels (also covers two-label names such as web.de)
px='^(.*\.)?([^.]+\.[a-z]+)$'

if [[ "$domain" =~ $p2 ]]; then
    md=${BASH_REMATCH[2]}
elif [[ "$domain" =~ $p3 ]]; then
    md=${BASH_REMATCH[2]}
elif [[ "$domain" =~ $px ]]; then
    md=${BASH_REMATCH[2]}
fi
echo "$domain -> $md"

This could be extended to handle the few 4-letter TLDs.
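
Rather than encoding suffix rules in regular expressions, one option is to drive the lookup from a list of known suffixes, so that new suffixes become data rather than script edits. The sketch below assumes a tiny hand-written sample in `SUFFIXES`. In practice the rules would come from a saved copy of the Public Suffix List (publicsuffix.org). The names `SUFFIXES`, `is_suffix` and `registrable_domain` are hypothetical, and the list's wildcard and exception rules are not handled here.

```bash
#!/usr/bin/env bash
# Illustrative sample only (assumption): real rules would be loaded from
# a local copy of the Public Suffix List.
SUFFIXES="com org net de us uk au co.uk com.au"

# Succeeds if the argument is one of the known suffixes.
is_suffix() {
    case " $SUFFIXES " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Print the known suffix plus one label, or fail if no suffix matches.
registrable_domain() {
    local rest="$1" parent
    is_suffix "$rest" && return 1      # the input is itself a suffix
    while :; do
        case $rest in
            *.*) parent=${rest#*.} ;;  # strip the leftmost label
            *)   return 1 ;;           # ran out of labels
        esac
        # Walking left to right means the longest suffix is tested first,
        # so co.uk is matched before uk.
        if is_suffix "$parent"; then
            printf '%s\n' "$rest"
            return 0
        fi
        rest=$parent
    done
}

for d in www.bbc.co.uk subdomain.ofasubdomain.google.com.au sub1.sub2.domain.us web.de; do
    echo "$d -> $(registrable_domain "$d")"
done
```

With this sample list, sub1.sub2.domain.us resolves to domain.us and web.de stays web.de, the two cases where a purely pattern-based rule goes wrong.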

dash-o
  • Thanks for this. I tested it with a .com.au domain, but the script returns just .au – Porcac1x Nov 06 '19 at 11:03
  • @Porcac1x Regex fixed to accept just the top-level domain (this was not part of the question). The code assumes a valid domain is provided, and will not work for parent domains (e.g., .com). Feel free to adjust as needed, and post fixes. – dash-o Nov 06 '19 at 11:17
  • What about those domains? https://en.wikipedia.org/wiki/List_of_Internet_top-level_domains#ICANN-era_generic_top-level_domains – Romeo Ninov Nov 06 '19 at 12:30
  • @RomeoNinov Do you have a suggestion? I'm not an expert on the topic. – dash-o Nov 06 '19 at 12:58
  • @dash-o, for me the question is still VERY unclear, because of the details above, and because the OP assumes that two-letter domains have one more subdomain level before the actual one. He/she does not consider domains like `web.de`, for example – Romeo Ninov Nov 06 '19 at 13:01
  • I think the problem cannot have a syntax-based solution. Now I understand why you asked for a network tool solution – Francesco Gasparetto Nov 06 '19 at 22:31
  • Hi guys, the solution provided by dash-o worked fine; however, the team that needs to use the script is not comfortable editing it to add every new TLD they meet. So I go back to my original question: does anybody know a network tool that returns the main domain if you provide a subdomain? I got it using dig and trace, but it looks very hard to find a way to grep just that result – Porcac1x Nov 07 '19 at 07:22
  • What about the domain sub1.sub2.domain.us? This will return sub2.domain.us, which is not the right answer. – ToiletGuy Feb 06 '21 at 10:45
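
For the network-tool route asked about in the comments, one possible sketch is to walk up the name with dig until a name answers an SOA query, which marks the closest enclosing DNS zone apex. The function name `zone_apex` is hypothetical. It is an assumption that the zone apex coincides with the registered domain: a subdomain delegated as its own zone would be reported instead, and a CNAME that points at another zone's apex can mislead the check.

```bash
#!/usr/bin/env bash
# Sketch: print the closest enclosing zone apex of a name (needs dig and
# network access). This is a DNS zone boundary, not necessarily the
# domain bought from a registrar.
zone_apex() {
    local name="${1%.}"                # strip a trailing dot if present
    while [ -n "$name" ]; do
        # An SOA answer in +short output has 7 whitespace-separated fields;
        # CNAME targets are printed as single-field lines and are ignored.
        if dig +short SOA "$name" |
            awk 'NF == 7 { found = 1 } END { exit !found }'; then
            printf '%s\n' "$name"
            return 0
        fi
        case $name in
            *.*) name=${name#*.} ;;    # drop the leftmost label, retry
            *)   name= ;;              # nothing left to try
        esac
    done
    return 1
}

# Usage (performs live DNS queries):
#   zone_apex subdomain.google.co.uk
```

Because the walk starts at the full name and stops at the first SOA answer, registry zones such as co.uk or com are only reached if nothing below them is a zone of its own.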