8

In the question "A comprehensive regex for phone number validation", the accepted answer has a number of comments. One of them, by @jcmcbeth, suggests the following simple regular expression for extracting the digits of a telephone number submitted by a user:

string.replace("[^\d+!x]", "")

Immediately after the comment with this suggested regular expression, another commenter asks "why the !x part?", and the next comment answers: "The !x is there to keep any "x" character from getting stripped."

This makes sense to me, except for the exclamation point !. Looking at documentation for character classes in regular expressions, I do not see that the exclamation point is a special character, and it doesn't seem to me that the x requires a special character preceding it. Also, from the discussion in the linked question, I do not see any comment indicating that an exclamation point might be part of a telephone number (which would explain its inclusion in the negated character class).

Can someone please explain to me why the exclamation point is present? Thanks.

Dan Nissenbaum

2 Answers

7

You're absolutely right; the x alone is sufficient. A ! just matches a literal !, inside a character class or outside one. The only place it has any special meaning is as part of a negative lookahead, (?!...), or a negative lookbehind, (?<!...).
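To see this concretely, here is a minimal sketch in Python (the original comment does not say which language it targets, so Python's `re.sub` stands in for `string.replace`, and the sample input is made up for illustration). It strips a phone number with and without the `!` in the class:

```python
import re

# Hypothetical sample input; only the pattern comes from the quoted comment.
raw = "+1 (555) 123-4567 x89!"

# Pattern from the comment: keep digits, '+', '!' and 'x'; strip everything else.
with_bang = re.sub(r"[^\d+!x]", "", raw)

# Same pattern without the '!': the 'x' is still kept, only the literal '!' is now stripped.
without_bang = re.sub(r"[^\d+x]", "", raw)

print(with_bang)     # +15551234567x89!
print(without_bang)  # +15551234567x89
```

The `x` survives in both cases, so the `!` does nothing for it; its only effect is to let literal exclamation points through.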

Alan Moore
-3

Surely [^\d+!x] means "NOT '\', NOT 'd', NOT '+', NOT '!' and NOT 'x'".

[ ] is a character group.

[^ ] negates the list of characters.

g1smd
  • Not quite. `\d` is the shorthand for digits; it works the same in a character class as it does anywhere else. If you really wanted it to match a backslash or a `d` you would have to escape the backslash: `[^\\d+!x]`. But that's not what the author intended. – Alan Moore Jul 07 '12 at 03:45
  • Depends on the dialect. Before Perl, `[^\d]` would most certainly mean any character except backslash or d (or newline). – tripleee Jul 07 '12 at 06:46
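
The distinction drawn in these comments can be checked with a short Python sketch. Python's `re` module follows the Perl-style reading, so this only illustrates that dialect; the sample string is made up, and the older dialects mentioned in the second comment are not covered here.

```python
import re

# Hypothetical sample: the characters d, 1, backslash, 2, +, x, !
sample = "d1\\2+x!"

# Perl-style reading: \d inside the class is the digit shorthand,
# so digits, '+', '!' and 'x' are kept; 'd' and the backslash are stripped.
digits_kept = re.sub(r"[^\d+!x]", "", sample)

# Escaped backslash: the class now lists a literal backslash and a literal 'd',
# so those are kept (with '+', '!' and 'x') and the digits are stripped.
literal_kept = re.sub(r"[^\\d+!x]", "", sample)

print(digits_kept)   # 12+x!
print(literal_kept)  # d\+x!
```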