4

I am using a JavaScript regex to do some data validation and to specify the characters that I want to accept (any alphanumeric characters, spaces, and the following: !&,'\- and maybe a few more that I'll add later if needed). My code is:

var value = userInput;
var pattern = /[^A-z0-9 "!&,'\-]/;
if (pattern.test(value)) {
    // do something
}

It works fine and rejects the characters that I don't want the user to enter, except the square brackets and the caret. From all the JavaScript regex tutorials that I have read, these are special characters: the brackets mean any character between them, and the caret in this instance means any character not between the square brackets. I have searched here and on Google for an explanation of why these characters are also accepted, but can't find one.

So can anyone help, why does my input accept the square brackets and the caret?

Jerry
David

4 Answers

7

The reason is that you are using A-z rather than A-Za-z. The ASCII range between Z (0x5A) and a (0x61) includes the square brackets, the backslash, the caret, the underscore, and the backquote.
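To see exactly which characters the A-z range pulls in, you can list the character codes that sit between Z and a. This is only an illustrative sketch; the variable name `between` is made up for the example.

```javascript
// Collect every character whose code lies strictly between 'Z' (0x5A) and 'a' (0x61).
let between = "";
for (let code = "Z".charCodeAt(0) + 1; code < "a".charCodeAt(0); code++) {
  between += String.fromCharCode(code);
}
console.log(between); // the six characters: [ \ ] ^ _ `

// All of them are matched by A-z but not by A-Za-z.
console.log(/^[A-z]+$/.test(between));    // true
console.log(/^[A-Za-z]+$/.test(between)); // false
```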

fred02138
3

Your regex is not in line with what you said:

I want to accept any alphanumeric characters, spaces and the following !&,'\- and maybe a few more that I'll add later if needed

If you want to accept only those characters, you need to remove the caret:

var pattern = /^[A-Za-z0-9 "!&,'\\-]+$/;

Notes:

  1. A-z also includes the characters:

    [\]^_`

    Use A-Za-z, or a-z with the i modifier, to match only letters:

     var pattern = /^[a-z0-9 "!&,'\\-]+$/i;
    
  2. \- is only the character -, because the backslash acts as the escape character. Use \\ to allow a literal backslash.

  3. ^ and $ are anchors, used to match the beginning and end of the string. This ensures that the whole string is matched against the regex.

  4. + is used after the character class to match one or more characters.
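Putting the four notes together, a whitelist check could look roughly like the sketch below. The helper name `isValid` and the sample inputs are hypothetical; only the pattern comes from the answer above.

```javascript
// Hypothetical helper: true only if every character of the input is allowed.
// \\ allows a literal backslash and the trailing - is a literal hyphen.
const allowed = /^[A-Za-z0-9 "!&,'\\-]+$/;

function isValid(input) {
  return allowed.test(input);
}

console.log(isValid("Tom & Jerry's")); // true
console.log(isValid("back\\slash"));   // true: the string contains one backslash
console.log(isValid("a[b]"));          // false: brackets are not in the class
console.log(isValid(""));              // false: + requires at least one character
```

Because of the anchors, a single disallowed character anywhere in the input makes the whole test fail.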


If you mean that you want to match characters other than the ones you accept and are using this to prevent the user from entering 'forbidden' characters, then the first note above describes your issue. Use A-Za-z instead of A-z (the second note is also relevant).
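For the "forbidden characters" reading, here is a minimal sketch that reports which characters were rejected. The function name `findForbidden` is an assumption for illustration, and it uses `String.prototype.match` with the g flag rather than `test`.

```javascript
// Hypothetical helper: returns the characters in the input that are not allowed.
function findForbidden(input) {
  // Negated class with the corrected A-Za-z range; the g flag collects every match.
  const forbidden = /[^A-Za-z0-9 "!&,'\\-]/g;
  return input.match(forbidden) || [];
}

console.log(findForbidden("Hi there")); // []
console.log(findForbidden("x[1]^2"));   // ["[", "]", "^"]
```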

Community
Jerry
  • I don't think the + is necessary (nor the anchors) since he's utilizing the test method, which just checks to see if there's a match at all. – Alex Sep 06 '13 at 16:06
  • @Alex Yes, I only realised that later on, and that's why I had added the second part in my answer. I left the first part which does the same thing except with the `.match()` method. – Jerry Sep 06 '13 at 16:42
0

I'm not sure what you want but I don't think your current regexp does what you think it does:

It tries to find one character that is not in A-z0-9 "!&,'\- (^ means not).

Also, A-z is an unusual range: it covers every character code from A to z, not just letters. You normally write a-z or A-Z.

So your current regexp matches strings like "." and "Hi." but not "Hi".
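A quick way to check this is to run the pattern from the question, unchanged, against a few sample strings. The sample strings are just illustrations.

```javascript
// The pattern from the question, unchanged (including the A-z range).
const pattern = /[^A-z0-9 "!&,'\-]/;

console.log(pattern.test("."));   // true: "." is not in the class
console.log(pattern.test("Hi.")); // true: the trailing "." is found
console.log(pattern.test("Hi"));  // false: every character is in the class
console.log(pattern.test("[^]")); // false: A-z covers "[", "^" and "]"
```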

Halcyon
0

Try this (the class keeps the space from your original pattern, so spaces are still accepted): var pattern = /[^\w "!&,'\\-]/;

Note: \w also includes _, so if you want to avoid that then try

var pattern = /[^a-z0-9 "!&,'\\-]/i;

I think the issue with your regex is that A-z is being understood as all characters between 0x41 (65) and 0x7A (122), which includes the characters [\]^_` that sit between A-Z and a-z. (Z is 0x5A (90) and a is 0x61 (97), which means the characters in between take up 0x5B through 0x60.)
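As a rough comparison of the two suggestions, the sketch below shows that they differ only in how they treat the underscore. The variable names are made up, and a space is included in both classes on the assumption that spaces should stay accepted, as the question asks.

```javascript
// Both patterns flag forbidden characters; they differ only on the underscore.
const withWordClass = /[^\w "!&,'\\-]/;
const withLetters = /[^a-z0-9 "!&,'\\-]/i;

console.log(withWordClass.test("snake_case")); // false: \w accepts "_"
console.log(withLetters.test("snake_case"));   // true: "_" is flagged
console.log(withLetters.test("[^]"));          // true: brackets and caret are flagged
```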

Alex