1

I'm trying to write a regular expression that does not match if '-' is at the end of a string. Here's part of my regex (it looks at the domain part of a URL, which can't have symbols at the beginning or end, but can have '-' in the middle of the string):

(([A-Z0-9])([A-Z0-9-]){0,61}([A-Z0-9]?)[\.]){1,8}

This also has to match 1-character domains; that's why I have ? on the end character and {0,61} on the center part.

So, in short is there a regex code to prevent matching for '-' if it's at the end of the string? And if you can prevent it for beginning, then that would be great too.

Matched input: site.
Invalid input: -site. or site-.

parti
  • There are so many edge cases with URLs that a regex really isn't the best way to do it. For example, you can use a negated character class to get rid of hyphens, but you'll need to include every character that's bad in a URL. Lookbehinds are slow, and I think they suffer from the same problem. A quick Google search turned up this: https://medialize.github.io/URI.js/; there are probably others. – Cfreak Jul 05 '16 at 22:24
  • Also, your `{0,61}` and the second check are unnecessary; just use `{1,61}` (though that doesn't account for sub-domains, and I think it should be `{1,63}`). – Cfreak Jul 05 '16 at 22:26
  • @Cfreak has a point: regexes are not really cut out for URLs. I tried various suggestions from [this other SO question](http://stackoverflow.com/a/3809435/3377150) and they were all far too lenient. [This page](http://code.tutsplus.com/tutorials/8-regular-expressions-you-should-know--net-6149) also suggests a regex for URLs, but it is similarly lenient. I'd go with the URI.js suggestion, OP! – Pierce Darragh Jul 06 '16 at 01:41
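
Following up on the comments above, here is a minimal sketch of the "parse first, then check" approach, assuming an environment with the built-in `URL` class (modern browsers and Node.js). The helper names `hostnameOf` and `isValidHostname` are made up for this example, and the 63-character label limit and 253-character total are assumptions taken from the usual hostname rules, not from the question.

```javascript
// One label: starts and ends with a letter or digit, hyphens only in between,
// at most 63 characters (1 + up to 61 + 1).
const LABEL = /^[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?$/i;

// Hypothetical helper: check a bare hostname label by label.
function isValidHostname(host) {
  if (host.length === 0 || host.length > 253) return false;
  // An empty label (from "site..com" or a leading dot) fails LABEL.
  return host.split(".").every((label) => LABEL.test(label));
}

// Hypothetical helper: let the URL parser find the host part.
// Returns null when the input is not a parseable absolute URL.
function hostnameOf(urlString) {
  try {
    return new URL(urlString).hostname;
  } catch {
    return null;
  }
}

console.log(hostnameOf("http://example.com/path"));
console.log(isValidHostname("site-.com"));
```

Splitting on dots keeps the regex small: each label is tested on its own, so there is no need for lookarounds to handle hyphens next to a period.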

2 Answers

1

in short is there a regex code to prevent matching for '-' if it's at the end of the string? And if you can prevent it for beginning, then that would be great too.

Yes, you can use a negative lookahead for this:

/^(?!-|.*(\.[.-]|-\.|-$))(?:[A-Z0-9-]{0,62}\.){1,8}[A-Z0-9]{3}$/gim
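
To see what the pattern accepts, here is a small sketch that runs it over a few inputs. The sample strings are made up for illustration. The `g` and `m` flags are left off in this sketch: a global regex keeps its `lastIndex` between `.test()` calls, which makes repeated checks unreliable.

```javascript
// The pattern from this answer, with only the `i` flag (see note above).
const DOMAIN = /^(?!-|.*(\.[.-]|-\.|-$))(?:[A-Z0-9-]{0,62}\.){1,8}[A-Z0-9]{3}$/i;

// Hypothetical sample inputs, chosen for illustration.
const samples = [
  "site.com",
  "my-site.example.org",
  "-site.com",  // hyphen at start
  "site-.com",  // hyphen before a period
  "site..com",  // double dot
  "site.info",  // four-character last label
  ".com",       // empty first label
];

// Map each sample to true (matches) or false (rejected).
const results = Object.fromEntries(samples.map((s) => [s, DOMAIN.test(s)]));
console.log(results);
```

Two things worth knowing if you adapt it: the final `[A-Z0-9]{3}` only accepts a last label of exactly three characters, so `site.info` is rejected, and `{0,62}` allows an empty label, so `.com` is accepted. Changing the quantifier to `{1,63}` would be one way to close the second gap, though that is a suggestion and not part of the pattern above.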


anubhava
  • I'm trying to implement the '-' at end but not having much luck. I tried this: `(([A-Z0-9])([A-Z0-9-](?!.*\-$)){0,62}[\.]){1,8}`. I also tried wrapping the new code in brackets. How do you use it in my example? – parti Jul 06 '16 at 19:16
  • Ok, I added some examples. – parti Jul 06 '16 at 19:23
  • Well it doesn't work - still need help in getting the code written properly. – parti Jul 06 '16 at 19:41
  • Ok, could you explain what the first parenthesis is all about, so I understand what it's doing. – parti Jul 06 '16 at 21:03
  • `(?!-|.*(\.-|-\.|-$))` is a negative lookahead. It denies the presence of a hyphen at the start, at the end, and before/after a period. – anubhava Jul 06 '16 at 21:06
  • Is there a way to have only 1 dot for each subdomain? I tried putting it in brackets, but it's still allowing double dots. – parti Jul 06 '16 at 21:20
  • Wonderful, it's working great now. Thanks for the help, much appreciated. – parti Jul 06 '16 at 21:54
0

Try:

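^(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.){1,8}$

This uses only basic syntax, so it should work with JS regexes (add the i flag for lowercase input). The idea is: ^ matches the beginning of the string, $ matches the end, and the first and last character of each label come from [A-Z0-9], which contains no hyphen. Note that ^ negates a character class only when it is the first character inside the brackets, as in [^-]; written as [A-Z0-9^-] it just adds a literal caret and hyphen to the allowed set.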
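
One detail worth checking here is how `^` behaves inside a character class in JS. A minimal sketch follows; the variable names are made up for this example. It also includes one hedged way to keep hyphens off both ends of each label without lookarounds.

```javascript
// `^` is NOT first in the class, so it is a literal caret;
// the trailing `-` is a literal hyphen. Both are allowed characters.
const literalCaret = /^[A-Z0-9^-]$/i;

// `^` IS first in the class, so the class is negated:
// any single character except a hyphen.
const notHyphen = /^[^-]$/;

// Assumed alternative: each label starts and ends with a letter or digit,
// with hyphens allowed only in between, followed by a dot.
const LABELS = /^(?:[A-Z0-9](?:[A-Z0-9-]{0,61}[A-Z0-9])?\.){1,8}$/i;

console.log(literalCaret.test("-"), notHyphen.test("-"));
console.log(LABELS.test("site."), LABELS.test("-site."), LABELS.test("site-."));
```

The optional inner group is what lets a 1-character label such as `a.` match while still forcing any longer label to end on a letter or digit.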

Juan Tomas