I have written a small function to remove sub-domains (if any) from an input domain string:
    def rm(text):
        print(text.replace(text, '.'.join(text.split('.')[-2:])), end="")
        print("\n")

    if __name__ == "__main__":
        rm("me.apple.com")
        rm("not.me.apple.com")
        rm("really.not.me.apple.com")
        # problem here
        rm("bbc.co.uk")
It works fine until the domain has a two-part TLD of the form .something.something, like .co.uk or .co.in.
So my output is:
apple.com
apple.com
apple.com
--> co.uk
whereas it should have been:
apple.com
apple.com
apple.com
bbc.co.uk
How do I fix this function in an elegant way, without checking for every possible two-part TLD?

Edit: I will have to check millions of domains, if that matters. What I want is to pass a domain to my function and get back a clean, subdomain-free domain.
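
For reference, here is a minimal sketch of the behaviour I am after. It assumes a set of known two-part suffixes is available; the three entries below are a hand-picked sample of my own, and in practice such a set would probably have to come from something like the Public Suffix List. The names registrable_domain and MULTI_LABEL_SUFFIXES are hypothetical. Whether the lookup table can be avoided altogether is exactly what I am unsure about.

```python
# Hypothetical, hand-picked sample of two-part suffixes (an assumption,
# not a complete list); a real run would load a maintained suffix list.
MULTI_LABEL_SUFFIXES = frozenset({"co.uk", "co.in", "com.au"})


def registrable_domain(domain, suffixes=MULTI_LABEL_SUFFIXES):
    """Return the domain with any sub-domains removed."""
    labels = domain.lower().rstrip(".").split(".")
    # Keep three labels when the last two form a known suffix, otherwise two.
    keep = 3 if ".".join(labels[-2:]) in suffixes else 2
    return ".".join(labels[-keep:])


if __name__ == "__main__":
    for d in ("me.apple.com", "not.me.apple.com",
              "really.not.me.apple.com", "bbc.co.uk"):
        print(registrable_domain(d))
```

Each call costs one split and one set lookup, so it should cope with millions of domains, but it only handles suffixes that are present in the set.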