Regex pattern if string contains characters

Question

Sorry, I'm an absolute beginner when it comes to JavaScript and regex.

I have a Codepen: http://codepen.io/anon/pen/dNJNvK

What I want to accomplish is to validate whether a string contains certain special UTF-8 characters. That's why I'm working with RegExp. The pattern I have here returns false only if the string to test equals one of the characters. But I want it to return false if the string contains one of these characters.

How can I accomplish this? I know it should be quite easy, but I wasn't able to get it working.

var regEx = new RegExp('[\u0001-\u00FF]');

console.log("This should be true: " + regEx.test("Tes"));
console.log("This should be false: " + regEx.test("Tes�"));

console.log("This returns false, because the string equals a special character: " + regEx.test("�"));

Just create a `<>` snippet instead of codepen as I just did for you — mplungjan, Jan 30 '17 at 09:16
Your RegEx always returns true because 'T', 'e' and 's' are in the range you specified. Use this tool for regexes, it works great :) https://regex101.com/r/apZoYk/1 — Gabriel, Jan 30 '17 at 09:22

score 2 · Answer 1 · edited May 23 '17 at 12:33

Why not the other way around?

See Regular expression to match non-English characters?

Also, you could use match instead of test, or test the WHOLE string:

var regEx = /[\x00-\x7F]/g; // allowed characters (ASCII); the range can be extended
function okChar(str) {
  var res = str.match(regEx); // every allowed character found, or null if none
  if (res === null) return false; // not a single allowed character
  return res.length === str.length; // true only if every character is allowed
}
console.log("This should be true: " + okChar("Tes"));
console.log("This should be false: " + okChar("Tesú"));

console.log("This returns false, because the string equals a special character: " + okChar("ú"));

answered Jan 30 '17 at 09:20

mplungjan


The problem I have is, that currently only specific characters shouldn't be allowed and in future it should be easy to add further characters. – user5417542 Jan 30 '17 at 09:28
@GOTO0 - fixed. – mplungjan Jan 30 '17 at 09:47
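
For the requirement in the comment above (forbid only specific characters, and make it easy to add more later), one possible sketch is to build the character class from a plain list. The names `forbidden`, `escapeForClass` and `isAllowed` and the characters in the list are made up for illustration; they are not from the question.

```javascript
// Hypothetical blocklist; add further characters here as needed.
var forbidden = ["\uFFFD", "\u20AC", "-"];

// Escape the characters that have a special meaning inside [...]:
// backslash, closing bracket, caret and hyphen.
function escapeForClass(ch) {
  return ch.replace(/[\\\]\^\-]/g, "\\$&");
}

// One character class that matches any single forbidden character.
var forbiddenRegEx = new RegExp("[" + forbidden.map(escapeForClass).join("") + "]");

// A string is allowed if it contains none of the forbidden characters.
function isAllowed(str) {
  return !forbiddenRegEx.test(str);
}

console.log(isAllowed("Test"));       // true
console.log(isAllowed("Tes\uFFFD"));  // false
console.log(isAllowed("a-b"));        // false
```

The regex has no `g` flag, so `test` keeps no state between calls and the function can be reused safely.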

score 2 · Answer 2 · answered Jan 30 '17 at 09:49

As @Gabriel commented, it's returning true because at least one character in the string matches your range.

What you want to do is check that every character is within the range:

/^[\u0001-\u00FF]+$/

or that any character is not within the range:

[^\u0001-\u00FF]

In the second case you get true when a special character is used and false when all characters are safe, so you probably have to flip the checks you do afterward.
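
As a rough side-by-side sketch of the two patterns (the variable names `allInRange` and `hasOutOfRange` are only illustrative; the range is the one from the question):

```javascript
// Pattern 1: anchored, so EVERY character must fall inside the range.
var allInRange = /^[\u0001-\u00FF]+$/;
// Pattern 2: negated class, matches if ANY character is outside the range.
var hasOutOfRange = /[^\u0001-\u00FF]/;

console.log(allInRange.test("Tes"));          // true
console.log(allInRange.test("Tes\uFFFD"));    // false
console.log(hasOutOfRange.test("Tes"));       // false
console.log(hasOutOfRange.test("Tes\uFFFD")); // true

// \u00E4 (a-umlaut) lies inside the range, so it counts as safe here.
console.log(hasOutOfRange.test("\u00E4"));    // false
```

One difference to keep in mind: because of the `+`, the anchored pattern rejects an empty string, while the negated pattern finds nothing wrong with it.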

answered Jan 30 '17 at 09:49

alebianco


Thanks. The second one works perfectly for me. However, I am now a bit confused why :D. I noticed the range I defined was completely wrong. When you look at the UTF-8 table, I only wanted to leave out the characters in the range I specified. https://de.wikipedia.org/wiki/UTF-8 Normally this would have been the first two rows, so special characters. But somehow it works how I want it to. When a user puts in ä, ö, and normal strings like "Test", everything is ok, but as soon as they enter a symbol like that question mark, it will return an error (I used !regex.test() so I get the right bool). – user5417542 Jan 30 '17 at 10:06
you can use a reference table of unicode characters like https://unicode-table.com/en/ to find what you actually need. right now you're taking the first 255 characters, which include the standard alphabet, plus numbers, some standard-ish symbols and a bunch of special characters: not a very sane range in my opinion :) – alebianco Jan 30 '17 at 10:23
just for a quick reference, the basic latin set goes from \u0020 to \u007F – alebianco Jan 30 '17 at 10:26
