0

If we have a URL, e.g. www.google.de, how can I get ONLY the "google" part?

In Java, new URL(url).getHost() works, but it gives me the full host (for example www.google.de), and that is not what I want.

Thank you

EDIT: If we have something like www.google.co.uk, then I also want only "google" as the result.

I don't want "google.de" or "www.google"; I ONLY want "google".

slim
Blnpwr
  • That's not called a hostname. – SLaks Jul 11 '17 at 15:23
  • What do you want to get from `www.google.co.uk`? – SLaks Jul 11 '17 at 15:23
  • @SLaks when we have `www.google.co.uk` I also want only "google" – Blnpwr Jul 11 '17 at 15:24
  • you can use [StringTokenizer](https://docs.oracle.com/javase/7/docs/api/java/util/StringTokenizer.html) – Blasanka Jul 11 '17 at 15:24
  • Possible duplicate of [Get domain name from given url](https://stackoverflow.com/questions/9607903/get-domain-name-from-given-url) – John Scattergood Jul 11 '17 at 15:26
  • No, it is not a duplicate. In this thread I want only "google"; in the link you posted, they want "google.de". It is a different problem. – Blnpwr Jul 11 '17 at 15:29
  • You're going to have to create your own rules. On what basis are you choosing "google" from "www.google.co.uk"? Second element? First element but ignoring "www" as a special case? What other special cases do you want to ignore? It's your requirement - you have to define it. – slim Jul 11 '17 at 15:30
  • So update the question to say how it differs from that question. – Blasanka Jul 11 '17 at 15:30
  • What do you want to get from `photos.google.co.uk` vs. `photos.google.de`? You need to precisely define your requirements, at which point you can then translate that into code. – SLaks Jul 11 '17 at 15:34
  • The part you want to ignore, i.e. the www, actually _is_ the hostname. – Björn Zurmaar Jul 11 '17 at 15:36
  • Also "www.google.co.uk" is not a URL. It's a domain name. URLs start with a scheme. – slim Jul 11 '17 at 15:46
  • Stepping back, I think this is an XY problem. https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem -- why do you want to do this? – slim Jul 11 '17 at 15:47

2 Answers

1

Splitting on a period and selecting the first or second element (whichever is not "www") would work:

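import java.net.URL;

URL url = new URL("http://www.host.ext.ext"); // throws MalformedURLException if the scheme is missing
String host = url.getHost();                  // host = "www.host.ext.ext"
String[] splitHost = host.split("\\.");       // splitHost = { "www", "host", "ext", "ext" }

// Take the second label when the first is "www", otherwise the first.
host = splitHost[0].equals("www") ? splitHost[1] : splitHost[0]; // host = "host"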

If anything other than www. can precede the name (photos.google.co.uk, for instance), and the extension may span more than one label (.co.uk, for instance), then there is no easy way to get just the part you want. As far as I know, you would have to iterate over a list of known extensions and return the label immediately before the longest matching extension.
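
The longest-matching-extension idea could be sketched roughly as follows. This is only an illustration, not a complete solution: the class name DomainLabel, the method labelBeforeSuffix, and the four-entry SUFFIXES list are all made up for this example, and a real program would need a much fuller list of extensions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class DomainLabel {
    // Hypothetical, deliberately tiny list of extensions; for illustration only.
    private static final List<String> SUFFIXES = Arrays.asList("co.uk", "uk", "de", "com");

    // Returns the label directly before the longest matching extension,
    // or null when no extension in the list matches the host.
    static String labelBeforeSuffix(String host) {
        String lower = host.toLowerCase(Locale.ROOT);

        // Find the longest extension that the host ends with.
        String best = null;
        for (String suffix : SUFFIXES) {
            boolean matches = lower.endsWith("." + suffix);
            if (matches && (best == null || suffix.length() > best.length())) {
                best = suffix;
            }
        }
        if (best == null) {
            return null;
        }

        // Drop ".<extension>" and keep whatever follows the last remaining dot.
        String rest = lower.substring(0, lower.length() - best.length() - 1);
        return rest.substring(rest.lastIndexOf('.') + 1);
    }
}
```

With this list, DomainLabel.labelBeforeSuffix("www.google.co.uk") and DomainLabel.labelBeforeSuffix("photos.google.de") both return "google", while a host with no listed extension returns null. Picking the longest match is what keeps "co" from being returned for .co.uk hosts.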

17slim
  • This [works](https://glot.io/snippets/ermp6i0mvo) but I agree with the comment above that this is likely an [XY Problem](https://meta.stackexchange.com/a/66378/160897) and there is missing information why "google" is more useful than "google.com". – styfle Jul 11 '17 at 15:51
  • Yeah, I think I agree as well now. – 17slim Jul 11 '17 at 16:05
  • While it is possible this is an XY, I think it could also be exactly what they're asking. Perhaps they actually want to display the important part, ignoring the extension? Although that might be problem X now that I think about it; "how do I display a website's display name/title?" – 17slim Jul 11 '17 at 16:11
  • Also, could anyone explain the downvote? The answer does exactly what the asker asked for, and explains the one possible pitfall. Help us out here. – 17slim Jul 11 '17 at 16:16
0

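The most basic solution would be to split on the dots and take the second element (this assumes url is a String of the form www.name.ext):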

 System.out.println(url.split("\\.")[1]);

Or you could try this approach from https://stackoverflow.com/a/23079402/2555419, which strips only a leading www. from the host:

import java.net.URI;
import java.net.URISyntaxException;

public String getHostName(String url) throws URISyntaxException {
    URI uri = new URI(url);
    String hostname = uri.getHost(); // null when the URL has no scheme, e.g. "www.google.de"
    // Guard against a null host, then strip a leading "www." if present.
    if (hostname != null) {
        return hostname.startsWith("www.") ? hostname.substring(4) : hostname;
    }
    return null;
}
Joey Pinto
  • He needs to specify his question, but he probably actually wants a ccTLD list. – SLaks Jul 11 '17 at 15:35
  • Yeah probably, I just gave him the most basic solution so he can have somewhere to start from if he wants to use String manipulation. – Joey Pinto Jul 11 '17 at 15:36