0

I am trying to get the extension (dk, com, org, eu, or any other domain extension) from a String.

For example:

http://www.example.com/siteone/sitetwo/currentpage

From this String I would like to get .com

I could go the messy way and use substring; however, the problem comes when a URL looks like this:

dk.webpage.otherstuff.com/page

So how can I go about this in a way that doesn't require me to check everything at every step of the way?

Marc Rasmussen

4 Answers

1

Use the getHost() method like this:

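import java.net.URI;
import java.net.URISyntaxException;

public static String getDomainName(String testUrl) throws URISyntaxException {
    URI fullUri = new URI(testUrl);
    // getHost() returns null when the URL has no scheme (e.g. "example.com/page")
    String domainName = fullUri.getHost();
    if (domainName == null) throw new URISyntaxException(testUrl, "no host component");
    return domainName.startsWith("www.") ? domainName.substring(4) : domainName;
}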

After that, use substring on the domain name to take the .com part, for example everything from the last dot onward.
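The two steps (getHost(), then substring) could be combined roughly as follows. This is only a sketch: the class DomainExtension and the method extensionOf are hypothetical names, and prepending "http://" when no scheme is present is an assumption made so that getHost() does not return null for input like dk.webpage.otherstuff.com/page.

```java
import java.net.URI;
import java.net.URISyntaxException;

class DomainExtension {
    /** Hypothetical helper: returns the last label of the host with its dot, e.g. ".com", or "" if none is found. */
    static String extensionOf(String url) {
        // URI.getHost() returns null for input without a scheme, so add one (an assumption of this sketch)
        String withScheme = url.contains("://") ? url : "http://" + url;
        try {
            String host = new URI(withScheme).getHost();
            if (host == null) {
                return "";
            }
            // Keep everything from the last dot of the host onward
            int lastDot = host.lastIndexOf('.');
            return lastDot < 0 ? "" : host.substring(lastDot);
        } catch (URISyntaxException e) {
            // Input that is not a valid URI (for example one containing a backslash) ends up here
            return "";
        }
    }

    public static void main(String[] args) {
        System.out.println(extensionOf("http://www.example.com/siteone/sitetwo/currentpage"));
        System.out.println(extensionOf("dk.webpage.otherstuff.com/page"));
    }
}
```

Both calls in main print .com. Note that this only returns the last label, so example.co.ma gives .ma rather than .co.ma.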

liquidsnake786
  • Hard-coding "www." and substring(4) won't always work, because nowadays some URLs start with something like www3.xyz.com, and then your code will fail. – emphywork Nov 25 '13 at 12:05
  • The OP can use a regex on the domainName then; it depends on how detailed you want to go. With the introduction of custom domains, this whole exercise would be nearly impossible to do exhaustively anyway. – liquidsnake786 Nov 25 '13 at 12:08
  • @liquidsnake786 With sites like http://www.homesick.nu\ I get an error; do you know why? (The error is "invalid character at index 7", meaning that the \ is wrong, but how come?) – Marc Rasmussen Nov 25 '13 at 12:20
1

Try this:

String ext = url.replaceAll(".*//[^/]*(\\.\\w+)/.*", "$1");

Some test code:

String url = "http://www.example.com/siteone/sitetwo/currentpage";
String ext = url.replaceAll(".*//[^/]*(\\.\\w+)/.*", "$1");
System.out.println(ext);

Output:

.com
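
Note that the pattern above needs both the // and a / after the host; when it does not match (for example dk.webpage.otherstuff.com/page, or a URL with no path), replaceAll returns the input unchanged. A possible variant with an optional scheme and an optional path is sketched below. The class name RegexExtension and the method extensionOf are hypothetical, and returning an empty string on no match is a choice made for this sketch.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexExtension {
    // Optional scheme, then the host up to the first ':' or '/', capturing its last ".label" as group 1
    private static final Pattern EXTENSION =
            Pattern.compile("(?:[a-zA-Z]+://)?[^/:]+(\\.\\w+)(?:[:/].*)?");

    /** Hypothetical helper: returns the captured extension, or "" when the input does not match. */
    static String extensionOf(String url) {
        Matcher m = EXTENSION.matcher(url);
        return m.matches() ? m.group(1) : "";
    }
}
```

With this sketch, RegexExtension.extensionOf("dk.webpage.otherstuff.com/page") returns .com, as does the original test URL.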
Bohemian
  • @RandomGuy $1 means "captured group 1", which is the bracketed pattern to capture the dot and word chars of the extension – Bohemian Nov 25 '13 at 12:28
1

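Use Guava's InternetDomainName class. Specifically, have a look at the publicSuffix() method, which is based on the Public Suffix List and therefore also handles multi-part suffixes such as co.uk.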

Kai Sternad
0

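Try this:

private String getExtensionFromDomain(String domainName) {
    // Index of the last dot; -1 if the name contains no dot at all
    int p = domainName.lastIndexOf('.');
    // Keep the dot so the result looks like ".ma"
    return p < 0 ? "" : domainName.substring(p);
}

For example.co.ma this will output: .ma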

Mehdi