I am trying to build a regex that detects disallowed characters in a String in an Android application. That is, if the string contains any character outside a set of allowed characters, I want to know. My allowed characters are:
"a-zA-Z0-9æøåÆØÅ_ -"
The user types in a name which I want to check. My current take on this is:
filename.matches("^((?![a-zA-Z0-9æøåÆØÅ_ -]).)*$");
based on this answer. This regex returns false for all input except when every character is disallowed, which is not what I want. I also tried a simpler regex
filename.matches("([^a-zA-Z0-9æøåÆØÅ_ -])");
to try to match any character not in the allowed character class, but this did not work as intended either.
What am I missing? Are there any quirks or special things in the Java regex engine in this particular case?
Examples
None of the regexes provided gives the desired result. Consider the examples below. When the string contains both accepted and unaccepted characters, the check fails to produce the proper result. The result is the same in Python. When I paste the two regexes below into https://regex101.com/, however, the latter seems to work as expected. It does not when run in my code, though. I also tried adding capturing groups (i.e. parentheses) to the regexes, but to no avail.
String foo1 = "this_is_a_filename";
String foo2 = "this%is%not%a%filename";
String foo3 = "%+!?";
String regex1 = "^[^a-zA-Z0-9æøåÆØÅ_ -]+$";
String regex2 = "[^a-zA-Z0-9æøåÆØÅ_ -]+";
boolean isMatch;
isMatch = foo1.matches(regex1); // false, ok
isMatch = foo2.matches(regex1); // false, should be true
isMatch = foo3.matches(regex1); // true, ok
isMatch = foo1.matches(regex2); // false, ok
isMatch = foo2.matches(regex2); // false, should be true
isMatch = foo3.matches(regex2); // true, ok
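For comparison, String.matches only returns true when the entire input matches the pattern, whereas a search (which is what regex101 appears to do) succeeds when any substring matches. Below is a minimal sketch of the "contains at least one disallowed character" check using Matcher.find(), assuming that is the intended semantics. The class name FilenameCheck and the helper containsDisallowed are hypothetical names used only for illustration.

```java
import java.util.regex.Pattern;

public class FilenameCheck {
    // Negated character class: matches one character outside the allowed set.
    private static final Pattern DISALLOWED =
            Pattern.compile("[^a-zA-Z0-9æøåÆØÅ_ -]");

    // Hypothetical helper: true if a disallowed character occurs anywhere in the name.
    // find() searches for a match in any position; it does not require the
    // whole input to match, unlike String.matches().
    public static boolean containsDisallowed(String name) {
        return DISALLOWED.matcher(name).find();
    }

    public static void main(String[] args) {
        System.out.println(containsDisallowed("this_is_a_filename"));     // false
        System.out.println(containsDisallowed("this%is%not%a%filename")); // true
        System.out.println(containsDisallowed("%+!?"));                   // true
    }
}
```

The inverse check could also be written with matches and the non-negated class, for example name.matches("[a-zA-Z0-9æøåÆØÅ_ -]+"), which is true only when every character is allowed.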