
I'm struggling to find a way to extract domain names from URLs.

My strings:

xyz.weam.com 
we2.wal.com 
abc.workwork.google.net

I would like the pattern to look for the TLD (com|org|net) and take the string before the match, including the match itself, going backwards until it hits the first dot (.).

I have tried different combinations of lookbehind and positive lookahead, but I was never able to make it stop at the right dot (.).

  • What programming language are you using? – MonkeyZeus Dec 13 '19 at 15:42
  • If positive lookbehinds are available, then this would work: `(?<=\.)[^.]+?(?=\.(?:com|org|net)$)` – MonkeyZeus Dec 13 '19 at 15:46
  • From the [regex tag info](https://stackoverflow.com/tags/regex/info): "Since regular expressions are not fully standardized, all questions with this tag should also include a tag specifying the applicable programming language or tool." – Toto Dec 13 '19 at 15:52
  • Thanks, it uses JavaScript, and your pattern didn't work. – MoAlshahrani Dec 13 '19 at 15:54
  • It's usually just as simple and more performant to use capturing groups instead of lookarounds, e.g. `\.([^.]+)\.(?:com|net|org)$`, and extract your target value from the first capturing group (`"abc.workwork.google.net".match(/\.([^.]+)\.(?:com|net|org)$/)[1]`) – Aaron Dec 13 '19 at 15:55
  • Thanks Aaron, that did the trick. I'm wondering, though, whether it is possible to make it generic so that it captures any TLD rather than only the ones listed in the pattern? – MoAlshahrani Dec 13 '19 at 16:19

1 Answer


Thanks for the answers, everyone, and especially to Aaron, whose suggestion worked perfectly.

His regex did the trick:

`\.([^.]+)\.(?:com|net|org)$`
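For anyone who wants to drop this into JavaScript, here is a minimal sketch of how the capturing-group approach could be wrapped in a function. The function names `extractDomainLabel` and `extractDomainLabelAnyTld` are my own, not from the comments. I escaped the dot before the TLD so it only matches a literal dot, and I trim the input because my strings may carry trailing spaces. The second function is one possible take on the "any TLD" follow-up: it simply treats the last label as the TLD, so it assumes single-label TLDs and would not handle suffixes such as `co.uk`.

```javascript
// Hypothetical helper (name is an assumption): returns the label right before
// a .com/.net/.org TLD, or null when the host does not match.
// Like the regex in the comments, it requires a dot before the captured label,
// so a bare "weam.com" returns null.
function extractDomainLabel(host) {
  const match = host.trim().match(/\.([^.]+)\.(?:com|net|org)$/);
  return match ? match[1] : null;
}

// Hypothetical generic variant: treats whatever follows the last dot as the TLD.
// Assumes a single-label TLD; multi-part suffixes like "co.uk" are not handled.
function extractDomainLabelAnyTld(host) {
  const match = host.trim().match(/(?:^|\.)([^.]+)\.[^.]+$/);
  return match ? match[1] : null;
}

console.log(extractDomainLabel("xyz.weam.com "));            // "weam"
console.log(extractDomainLabel("we2.wal.com "));             // "wal"
console.log(extractDomainLabel("abc.workwork.google.net"));  // "google"
console.log(extractDomainLabelAnyTld("abc.workwork.google.io")); // "google"
```

Returning `null` instead of indexing `[1]` directly avoids a `TypeError` when a string does not match at all.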