You've mixed up the terms here. A TLD (Top Level Domain) is the last segment of a domain name, i.e. the part that follows the final dot (e.g. .com, .net, etc.).
What you're searching for is the second level domain (or SLD).
I've edited Daveo's answer to fit your question, so that the SLD is returned in the first capture group:
(?:[-a-zA-Z0-9@:%_\+~.#=]{2,256}\.)?([-a-zA-Z0-9@:%_\+~#=]*)\.[a-z]{2,6}\b(?:[-a-zA-Z0-9@:%_\+.~#?&\/\/=]*)
Here is a demo: https://regex101.com/r/x2luiO/1
Explanation:
(?:[-a-zA-Z0-9@:%_\+~.#=]{2,256}\.)?
- This optional, non-capturing first part consumes everything before your SLD (the subdomains).
([-a-zA-Z0-9@:%_\+~#=]*)
- This is your capturing group, where the SLD is returned.
\.[a-z]{2,6}
- This will match the TLD (add parentheses around it if you also want to capture it).
\b(?:[-a-zA-Z0-9@:%_\+.~#?&\/\/=]*)
- And this is the rest of the regex, which should match the port and/or the rest of the URL (e.g. /example/page/).
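As a quick check, the pattern explained above can be run from Python. This is only a minimal sketch: extract_sld is a hypothetical helper name, and the sample URLs are made up for illustration.

```python
import re

# The same regex as above, split into the four parts from the explanation.
SLD_PATTERN = re.compile(
    r"(?:[-a-zA-Z0-9@:%_\+~.#=]{2,256}\.)?"  # optional subdomains
    r"([-a-zA-Z0-9@:%_\+~#=]*)"              # capture group 1: the SLD
    r"\.[a-z]{2,6}\b"                        # the TLD
    r"(?:[-a-zA-Z0-9@:%_\+.~#?&\/\/=]*)"     # port and/or rest of the URL
)


def extract_sld(url):
    """Return the SLD found in url, or None if the pattern does not match.

    Hypothetical helper, not part of any library.
    """
    match = SLD_PATTERN.search(url)
    return match.group(1) if match else None


print(extract_sld("www.example.com"))                    # example
print(extract_sld("https://www.example.com/path/page"))  # example
print(extract_sld("example.com"))                        # example
```

Splitting the pattern into adjacent raw strings keeps each piece next to its explanation; Python joins them into one string before compiling.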
It's also worth pointing out that this regex will not give the expected result if you're testing a domain that combines an SLD with a ccTLD (Country Code TLD), for example .co.uk and .co.it. Both are just domain endings for commercial and general websites; however, in both cases the regex will return co as the SLD.
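The limitation is easy to reproduce. The sketch below also shows one possible workaround, which is my own assumption and not part of the regex: check the host against a list of known two-part endings first. TWO_PART_SUFFIXES and both helper names are hypothetical, the list is deliberately tiny, and the workaround assumes a bare host name rather than a full URL; real code would need a complete list such as the Public Suffix List.

```python
import re

SLD_PATTERN = re.compile(
    r"(?:[-a-zA-Z0-9@:%_\+~.#=]{2,256}\.)?"
    r"([-a-zA-Z0-9@:%_\+~#=]*)"
    r"\.[a-z]{2,6}\b"
    r"(?:[-a-zA-Z0-9@:%_\+.~#?&\/\/=]*)"
)

# Hypothetical, incomplete list used only for illustration.
TWO_PART_SUFFIXES = (".co.uk", ".co.it")


def extract_sld(host):
    """Plain regex version: returns capture group 1, or None."""
    match = SLD_PATTERN.search(host)
    return match.group(1) if match else None


def extract_sld_with_suffixes(host):
    """Assumes host is a bare host name (no scheme, port or path)."""
    for suffix in TWO_PART_SUFFIXES:
        if host.endswith(suffix):
            # Drop the whole two-part ending, then keep the last label.
            return host[: -len(suffix)].rsplit(".", 1)[-1]
    return extract_sld(host)


print(extract_sld("www.example.co.uk"))                # co
print(extract_sld_with_suffixes("www.example.co.uk"))  # example
print(extract_sld_with_suffixes("www.example.com"))    # example
```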