I'm trying to write a Java RegEx that extracts the domain name from a list of domains, subdomains, and multi-level subdomains.
The RegEx I have written lists each TLD explicitly, which is already too many to maintain, and there are many more out there: https://publicsuffix.org/list/effective_tld_names.dat
What is a better way to capture the domain name? The goal is to strip the subdomain and keep only the domain name so I can resolve or ping it.
This is the RegEx I have come up with:
(\w*.(?:\.co|\.org|\.net|\.int|\.edu|\.gov|\.mil|\.arpa|\.tv|\.aero|\.asia).*)
Here is the sample list I am testing against:
comnettest.google.com
doubleclick.net
googleapis.com
imrworldwide.com
bom.gov.au
www.bom.gov.au
googleapis.com
www.google.com
www.twiiter.com
dynamic.t2.tiles.virtualearth.net
domain.com
1-A.domain.com
1-A.2-B.domain.com
1-A.2-B.3-C.domain.com
mt0.google.com
twitch.tv
stream.twitch.tv
streamcom.com.au
network.google.com
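
One direction that might avoid maintaining a RegEx alternation is to match against a set of known suffixes instead. The following is only a rough sketch under stated assumptions: the class name RegistrableDomain, the method registrableDomain, and the tiny hand-picked SUFFIXES set are all hypothetical, and a real version would need to load the full list from the URL above and honor its wildcard and exception rules, which this sketch ignores.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

final class RegistrableDomain {

    // Assumption: a tiny illustrative subset of suffixes, NOT the real public suffix list.
    private static final Set<String> SUFFIXES = new HashSet<>(Arrays.asList(
            "com", "net", "org", "tv", "au", "com.au", "gov.au"));

    // Returns the longest known suffix plus the one label in front of it,
    // or null when no known suffix matches or the host is itself a bare suffix.
    static String registrableDomain(String host) {
        String[] labels = host.toLowerCase(Locale.ROOT).split("\\.");
        // Start at index 1 so at least one label remains in front of the suffix.
        // Walking left to right tries the longest candidate suffix first.
        for (int i = 1; i < labels.length; i++) {
            String candidate = String.join(".", Arrays.copyOfRange(labels, i, labels.length));
            if (SUFFIXES.contains(candidate)) {
                return labels[i - 1] + "." + candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String[] samples = {
            "www.bom.gov.au", "stream.twitch.tv", "1-A.2-B.3-C.domain.com", "streamcom.com.au"
        };
        for (String sample : samples) {
            System.out.println(sample + " -> " + registrableDomain(sample));
        }
    }
}
```

With this subset, www.bom.gov.au maps to bom.gov.au and stream.twitch.tv maps to twitch.tv. The input is lower-cased, so 1-A.2-B.3-C.domain.com comes back as domain.com. If pulling in a library is acceptable, Guava's InternetDomainName.from(host).topPrivateDomain() appears to cover the same need using a bundled copy of the public suffix list, though I have not verified how current that copy is.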