
I have URLs on different domains, for example:

https://www.google.com
https://www.google.de
https://www.google.co.uk
https://www.google.com/randompath
https://www.google.de/randompath
https://www.google.co.uk/randompath

I need to extract only the top-level domain from every possible link. For this example it would be .com, .de, .co.uk, and so on.

The regular expression I tried:

/\.[^.]{2,3}(?:\.[^.]{2,3})?$/

It only works when there is no path after the top-level domain. Does anyone have a solution?

isherwood
Poison

1 Answer


Use the URL API to parse each link, then work on its hostname.

This version returns only the last label of the hostname. It does not handle second-level suffixes such as co.uk, which are not trivial to get right without a lookup table.

var domains = `https://www.google.com
https://www.google.de
https://www.google.co.uk
https://www.google.com/randompath
https://www.google.de/randompath
https://www.google.co.uk/randompath`
  .split(/\n/)
  // new URL() drops the path, so only the hostname is split on dots
  .map(href => new URL(href).hostname.split(".").pop());

console.log(domains); // last label only, e.g. "uk" for www.google.co.uk

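This version returns the last label, or the last two labels, depending on how many labels the hostname has. It is a heuristic: a hostname such as a.b.example.com has more than three labels, so it would yield example.com.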

var domains = `https://www.google.com
https://www.google.de
https://www.google.co.uk
https://foo.bar.my.subdomain.example.co.uk
https://www.google.com/randompath
https://www.google.de/randompath
https://www.google.co.uk/randompath`
  .split(/\n/)
  .map(href => {
    // Split the hostname (path already removed by URL) into its labels
    let hostnameParts = new URL(href).hostname.split(".");
    // Heuristic: more than three labels -> keep the last two, else the last one
    let domain = hostnameParts.slice(hostnameParts.length > 3 ? -2 : -1);
    return domain.join(".");
  });

console.log(domains);
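For the output requested in the question (a leading dot, and .co.uk rather than uk), a lookup table is the more reliable route. The following is only a minimal sketch under stated assumptions: `KNOWN_SUFFIXES` and `getSuffix` are hypothetical names, and the six-entry set is a stand-in for a complete list such as the Public Suffix List or a library that wraps it.

```javascript
// Hypothetical, deliberately tiny suffix list (an assumption for illustration).
// A real application would load a complete list instead.
const KNOWN_SUFFIXES = new Set(["com", "de", "uk", "co.uk", "au", "com.au"]);

// Return the longest known suffix of the URL's hostname with a leading dot,
// or null when nothing in the list matches.
function getSuffix(href) {
  const labels = new URL(href.trim()).hostname.split(".");
  // Candidates run from longest to shortest, so "co.uk" wins over "uk".
  for (let i = 1; i < labels.length; i++) {
    const candidate = labels.slice(i).join(".");
    if (KNOWN_SUFFIXES.has(candidate)) return "." + candidate;
  }
  return null;
}

const urls = [
  "https://www.google.com",
  "https://www.google.co.uk/randompath",
  "https://foo.bar.my.subdomain.example.co.uk",
  "https://a.b.example.com/path",
];

console.log(urls.map(getSuffix));
// [ '.com', '.co.uk', '.co.uk', '.com' ]
```

Because the match depends on the list and not on the number of labels, deep subdomains do not change the result. Suffixes missing from the list return null, so the quality of the answer is only as good as the list behind it.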
mplungjan
  • Thanks. But the result should start with a dot, and for domains like .co.uk and .com.au it should return .co.uk and .com.au instead of uk and au. – Poison Sep 26 '19 at 17:52
  • I know. But what you had would not give that either. My solution is a better start. You can interrogate the second level for co/or/ and others but see my link to Wikipedia why it is likely not trivial at all – mplungjan Sep 26 '19 at 17:59
  • And then what happens when someone has many subdomains such as http://foo.bar.my.subdomain.example.co.uk/ ? As you've said, it is non-trivial. To do this effectively, the developer needs a list of all suffixes or a dependency that abstracts this requirement. – BEVR1337 Sep 26 '19 at 18:11
  • @BEVR1337 My second code handles your foo bar – mplungjan Sep 27 '19 at 17:04