I'm looking for a regex that can extract all subdomains plus the domain from a URL.
I already found this one from here:
/([a-z0-9|-]+\.)*[a-z0-9|-]+\.[a-z]+/
It can extract subdomain + domain, but unfortunately it also accepts a - at the
start of a subdomain/domain, and it doesn't support non-ASCII characters as specified in RFC 3490.
Here are some examples I want to catch:
http://www.例如.中国/
http://www.würstchen.mit.käsebrötchen.de:8080/news/index.html
https://www.fußballspiel.de/
http://www.simulateur-prêt.fr
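
To make the goal concrete, here is a rough Python sketch of the behaviour I'm after. It is only an illustration under a few assumptions, not a full RFC 3490 validator: the host is taken to follow "://" directly (no user:password@ part), a label is any run of Unicode letters/digits with hyphens allowed only in the middle, and the names LABEL, HOST_RE and extract_host are made up for this example.

```python
import re
import unicodedata

# One label: starts and ends with a Unicode letter or digit, hyphens only inside.
# [^\W_] means "a word character other than underscore"; for str patterns in
# Python 3 this is Unicode-aware, so it also covers characters like ü, ß or 中.
LABEL = r"[^\W_](?:(?:[^\W_]|-)*[^\W_])?"

# Host: one or more "label." groups followed by a final label, right after "://".
HOST_RE = re.compile(r"://((?:" + LABEL + r"\.)+" + LABEL + r")")


def extract_host(url):
    """Return subdomains + domain from url, or None if nothing valid follows '://'."""
    # Normalise to NFC so that letters written with combining accents
    # become single code points that the pattern treats as letters.
    url = unicodedata.normalize("NFC", url)
    match = HOST_RE.search(url)
    return match.group(1) if match else None


if __name__ == "__main__":
    for url in (
        "http://www.例如.中国/",
        "http://www.würstchen.mit.käsebrötchen.de:8080/news/index.html",
        "https://www.fußballspiel.de/",
        "http://www.simulateur-prêt.fr",
        "http://-invalid.example.com/",
    ):
        print(url, "->", extract_host(url))
```

With this sketch the four examples above yield their full host names, while a host whose first label starts with - gives None. It does not check label lengths or convert anything to Punycode.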