NB. I only want to know whether an unescaped hyphen in a regex definition is valid syntax. This is not a question about matching email addresses, the meaning of the hyphen or the backslash, quantifiers, or anything else. Also, please note that the linked answer doesn't really discuss the validity of an escaped versus an unescaped hyphen.

Usually I declare the regex for matching email addresses like this.

var emailPattern = /^[a-z.\-_]+@[a-z]+[.]{1}[a-z]{2,3}$/;
emailPattern.test('ss.a_a-@ass.com');

Now, by mistake, a colleague of mine forgot the escape character and it **still** worked, which surprised me because of the hyphen's range meaning. It looks like this.

var weirdPattern = /^[a-z._-]+@[a-z]+[.]{1}[a-z]{2,3}$/;
weirdPattern.test('ss.a_a-@ass.com');

Apparently, it works because the hyphen is the last character in the brackets. My question is whether this is just a happy coincidence or valid syntax. Have I been regexing wrong my whole life?

Konrad Viltersten

1 Answer

Inside a character class, a hyphen normally denotes a range. However, when it is placed at the beginning or at the end of the character class, there is no need to escape it.

Note that, in some browsers, hyphens at any position in the character class may still be treated as range metacharacters, so it is safest to always escape them.

Quoting from regular-expressions.info

The hyphen can be included right after the opening bracket, or right before the closing bracket, or right after the negating caret. Both [-x] and [x-] match an x or a hyphen. [^-x] and [^x-] match any character that is not an x or a hyphen. Hyphens at other positions in character classes where they can't form a range may be interpreted as literals or as errors. Regex flavors are quite inconsistent about this.
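
As a rough check of the above, here is a minimal sketch in JavaScript. It reuses the two patterns from the question and adds a hypothetical third variant with the hyphen right after the opening bracket. The variable names are made up for illustration, and the results only speak for the JavaScript engine the snippet is run in.

```javascript
// The two patterns from the question, plus a hypothetical variant
// with the hyphen placed first in the character class.
const escaped = /^[a-z.\-_]+@[a-z]+[.]{1}[a-z]{2,3}$/;
const trailing = /^[a-z._-]+@[a-z]+[.]{1}[a-z]{2,3}$/;
const leading = /^[-a-z._]+@[a-z]+[.]{1}[a-z]{2,3}$/;

const sample = 'ss.a_a-@ass.com';
const results = [escaped, trailing, leading].map((re) => re.test(sample));
console.log(results); // [ true, true, true ]

// For contrast: a hyphen between two characters forms a range.
// Here [.-_] covers everything from '.' (0x2E) to '_' (0x5F).
const midRange = /^[.-_]+$/;
console.log(midRange.test('A')); // true: 'A' (0x41) lies inside the range
console.log(midRange.test('-')); // false: '-' (0x2D) lies just below it
```

The last two lines show why position matters: moved between `.` and `_`, the same hyphen stops matching itself and silently lets in uppercase letters and digits.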

Tushar
  • Also, you could include what `[#--0]` would match. – Kenny Lau Apr 29 '16 at 13:38
  • I understand what you say and can confirm that it does indeed work. I can also imagine that it sometimes breaks and should be escaped for safety's sake. The question, however, is whether it's valid syntax, definition-wise and academically speaking. – Konrad Viltersten Apr 29 '16 at 13:39
  • @KonradViltersten That depends on how the regex engine has implemented it. For widely used browsers it works. – Tushar Apr 29 '16 at 13:42
  • I have never heard of a browser that requires escaping a hyphen at the beginning of a character class for it to be treated as a literal. Could you name one? Konrad, **it is a totally valid syntax** that has its roots in the POSIX standard, where characters inside a character class cannot be escaped. – Wiktor Stribiżew Apr 29 '16 at 13:45
  • @WiktorStribiżew I don't remember exactly, but I've seen the same in a post on SO, and the browser name (_again, not sure_) is Komodo – Tushar Apr 29 '16 at 13:46
  • @WiktorStribiżew I've made the disclaimer even more explicit. Feel free to review it. I fear you were a bit too trigger-happy in marking it as a dupe. I can't see the actual answer to **my** question there. Those are more *how-to* than *why-and-why-not*, in my opinion. – Konrad Viltersten Apr 29 '16 at 13:48
  • @Tushar: This? [*Komodo can accept Python syntax regular expressions in it's various Search features.*](http://docs.activestate.com/komodo/4.4/regex-intro.html) - there is a really weird quirk in the Python regex flavor: it does not allow an unescaped hyphen after shorthand character classes (do not use `[\w-,]` in Python). But you do not have to escape `-` at the start/end of a character class. – Wiktor Stribiżew Apr 29 '16 at 13:48
  • @WiktorStribiżew I remember seeing it in an SO post, but I'm not able to find it now – Tushar Apr 29 '16 at 13:50
  • From what I know, there is only one engine that requires escaping the hyphen at the end of the character class: Elasticsearch. All other flavors allow unescaped literal hyphens at both the beginning and the end. – Wiktor Stribiżew Apr 29 '16 at 13:55
  • @Tushar You forgot to mention that it will also match `everything` between `#` and `-`. [DEMO](https://regex101.com/r/lI2oL5/1). – Kenny Lau Apr 29 '16 at 14:00
  • @WiktorStribiżew I've made it community wiki, can you please edit it to add more details. – Tushar Apr 29 '16 at 14:02
  • You did not have to, and you can ask moderators to revert it to a normal answer :) If I have time, I will add. – Wiktor Stribiżew Apr 29 '16 at 14:06
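
The `[#--0]` case raised in the comments can be probed with a small sketch like the one below. The probe characters and variable names are chosen here for illustration only. The assumption is that the class is read as the range `#` to `-` (0x23 to 0x2D) followed by a literal `0`, which is how JavaScript parses it.

```javascript
// [#--0] is read as: range '#'..'-' (0x23..0x2D), then a literal '0'.
const cls = /^[#--0]$/;

// A few hand-picked probe characters around the range boundaries.
const probes = ['#', '+', ',', '-', '0', '.', '/', '1'];
const matched = probes.filter((ch) => cls.test(ch));

console.log(matched); // [ '#', '+', ',', '-', '0' ]
```

Note that `.` and `/` (0x2E and 0x2F) are not matched even though they sit between `-` and `0` in ASCII: the second hyphen is the end of the range, not the start of another one.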