0

I have the regex below, which I expect to reject certain characters. These characters are correctly rejected: £"~#¬|{} but these are not: @[]/?;:

So, for example, test£test is correctly identified as invalid, but test@test is incorrectly identified as valid.

Testing this on https://regex101.com/ identifies the problem as the brackets and indicates that I need to escape the first bracket ( and the hyphen -, like this: ^[a-zA-z0-9!$%^&*\()\-_=+]+?$. On https://regex101.com the expression then behaves as expected, but if I try to use escape characters like this in Java the compiler gives an error.

Any ideas how I can get this regular expression to behave as I want? Sorry if this is obvious.

         final String REGEX = "^[a-zA-z0-9!$%^&*()-_=+]+?$";
         System.out.println("Please enter a password");
         String password = input.next();
         Pattern p = Pattern.compile(REGEX);
         Matcher m = p.matcher(password);
         if (!m.matches()) {
            System.out.println("Illegal characters");
         }

3 Answers

2

Brief

^[a-zA-z0-9!$%^&*()-_=+]+?$
     ^^^          ^^^

The first underlined range is A-z. This matches:

ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz

The second underlined range corresponds to

)*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_
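
If no ASCII table is to hand, a range can also be checked by enumerating it. The following is a minimal sketch (the names `RangeDemo` and `matchedAscii` are made up for illustration): it loops over printable ASCII and collects every character that a given character class accepts. Run against [A-z] and [)-_] it should reproduce the two lists above.

```java
import java.util.regex.Pattern;

class RangeDemo {
    // Returns every printable ASCII character (0x20..0x7E) that the
    // given character class, e.g. "[A-z]", accepts on its own.
    static String matchedAscii(String charClass) {
        Pattern p = Pattern.compile(charClass);
        StringBuilder sb = new StringBuilder();
        for (char c = 0x20; c <= 0x7E; c++) {
            if (p.matcher(String.valueOf(c)).matches()) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(matchedAscii("[A-z]"));
        System.out.println(matchedAscii("[)-_]"));
        // The full class from the question, to see everything it lets through.
        System.out.println(matchedAscii("[a-zA-z0-9!$%^&*()-_=+]"));
    }
}
```

The third call prints the whole set accepted by the question's character class, which is where @ [ ] / ? ; : show up.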

ASCII Table


Code

See regex in use here. Note that the linked regex is only the character set for the first example below; this is to show which characters it's actually matching.

Use any of the following:

^[a-zA-Z0-9!$%^&*()\-_=+]+?$
^[a-zA-Z0-9!$%^&*()_=+-]+?$
^[\w!$%^&*()=+-]+?$
^[\w!$%&^(-+=-]+?$
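
In Java source the first form needs its backslash doubled, because a lone \- inside a string literal is an illegal escape and will not compile. A minimal sketch of the first two patterns as Java constants follows; the class name `PasswordCheck`, the method `isValid`, and the sample inputs are illustrative assumptions, not part of the question's code.

```java
import java.util.regex.Pattern;

class PasswordCheck {
    // Hyphen escaped: the regex \- is written "\\-" in a Java string literal.
    static final Pattern ESCAPED =
            Pattern.compile("^[a-zA-Z0-9!$%^&*()\\-_=+]+?$");
    // Hyphen moved to the end of the class, where it is literal and needs no escape.
    static final Pattern HYPHEN_LAST =
            Pattern.compile("^[a-zA-Z0-9!$%^&*()_=+-]+?$");

    // True when the whole password consists only of allowed characters.
    static boolean isValid(Pattern p, String password) {
        return p.matcher(password).matches();
    }

    public static void main(String[] args) {
        // \u00A3 is the pound sign, written as an escape to avoid source-encoding issues.
        String[] samples = {"test-test", "test@test", "test\u00A3test", "test[test"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(ESCAPED, s)
                    + " / " + isValid(HYPHEN_LAST, s));
        }
    }
}
```

Both patterns should accept test-test and reject the other three samples.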
ctwheels
  • you're correct, i have upvoted – AdamPillingTech Dec 21 '17 at 17:24
  • @PillHead thanks. I added a regex101 link to display which characters it accepts. – ctwheels Dec 21 '17 at 17:25
  • Really useful explanation of how the ranges work - thank you. I think in the first one Java requires a double escape (I get a compiler error with just a single escape). The second one works fine. – FullTiltBoogie Dec 21 '17 at 17:26
  • @FullTiltBoogie yes the first one (used in Java) requires a second backslash to escape it: `^[a-zA-Z0-9!$%^&*()\\-_=+]+?$`, the other ones don't require this. The last regex is the shortest way to include all those characters, but most difficult to understand. You can also use `\w` instead of `[a-zA-Z0-9_]` – ctwheels Dec 21 '17 at 17:27
  • OK - that last one is great. I've come across the \w character and now that I understand the ranges I can see how it works. Sorry I can't up-vote - I'm too green! – FullTiltBoogie Dec 21 '17 at 17:32
  • @FullTiltBoogie no problem! Regex101 is a very powerful tool for learning regular expressions. Play around with it and whenever you're using a range cross-reference it with the ASCII table to ensure you're not matching anything you don't want to match. – ctwheels Dec 21 '17 at 17:34
-1

The issue with your regex is that it contains special characters which require escaping.

All the characters referenced in this page will require escaping if they are valid in your password.

Pattern docs

Therefore you should use a regex something like the following. I have not thoroughly tested this, however, so please write some thorough unit tests to cover all legitimate possibilities.

"^[a-zA-z0-9!\\$%\\^&\\*\\(\\)\\-_\\=\\+]+?$"
AdamPillingTech
-1

Sorry - I now realise that I have a number of meta-characters which all need escaping. The following REGEX behaves as expected, with double backslashes to escape each meta character:

final String REGEX = "^[a-zA-z0-9\\!\\$%\\^&*\\(\\)\\-_\\=\\+]+?$";

If there is a more elegant way I'd love to hear it!
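
On the "more elegant" question, two things seem worth checking. Inside a character class most of those backslashes are optional, and the pattern above still contains the A-z range, so it continues to accept [ \ ] and the backtick. The sketch below compares it with the shorter \w form from the first answer; the class name `RegexCompare` and the sample strings are made up for illustration.

```java
import java.util.regex.Pattern;

class RegexCompare {
    // The fully escaped pattern from this answer, unchanged (note the A-z range).
    static final Pattern ORIGINAL =
            Pattern.compile("^[a-zA-z0-9\\!\\$%\\^&*\\(\\)\\-_\\=\\+]+?$");
    // Shorter form: \w covers a-z, A-Z, 0-9 and _, and the hyphen sits last
    // in the class so it is literal. Only \w needs a (doubled) backslash.
    static final Pattern COMPACT =
            Pattern.compile("^[\\w!$%^&*()=+-]+$");

    public static void main(String[] args) {
        String[] samples = {"aZ9!$%^&*()-_=+", "test@test", "test[test", "test`test"};
        for (String s : samples) {
            System.out.println(s
                    + " original=" + ORIGINAL.matcher(s).matches()
                    + " compact=" + COMPACT.matcher(s).matches());
        }
    }
}
```

Both patterns reject test@test, but only the compact one rejects test[test and test`test, since those characters fall between Z and a in ASCII.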