49

I know I can check that a string consists only of word characters (0-9, A-Z, a-z and underscore) by using \w in a regex like this:

function isValid(str) { return /^\w+$/.test(str); }

But how do I check whether the string contains ASCII characters only? (I think I'm close, but what did I miss?)

Reference: https://stackoverflow.com/a/8253200/188331

UPDATE: The standard character set is enough for my case.

Raptor
  • What's wrong with current solution? – zerkms Jan 14 '13 at 04:00
  • Standard or extended character set? – zzzzBov Jan 14 '13 at 04:01
  • I want ASCII symbols such as parenthesis , hyphen , question marks , fullstop to be included. – Raptor Jan 14 '13 at 04:01
  • standard character set is enough for this case. – Raptor Jan 14 '13 at 04:02
  • possible duplicate of [Regex any ascii character](http://stackoverflow.com/questions/3203190/regex-any-ascii-character) – sachleen Jan 14 '13 at 04:02
  • I will disagree with @sachleen if only because the linked question does not specify the language that the regex is implemented in. It can make a big difference. For example, [the feature described in this answer is not valid in JavaScript](http://stackoverflow.com/a/3203258/497418). – zzzzBov Jan 14 '13 at 21:39
  • @zzzzBov and there are other answers that do work in JS. – sachleen Jan 15 '13 at 00:30
  • @sachleen, while that may be true, generally a question is considered different when it deals with different languages. For example, a question answering what the `+=` operator does in JavaScript is not considered a duplicate of one that answers what the `+=` operator does in C#, even though the same answer may be applicable. – zzzzBov Jan 15 '13 at 01:46

4 Answers

119

All you need to do is test that every character falls within the right character range.

function isASCII(str) {
    return /^[\x00-\x7F]*$/.test(str);
}

Or if you want to possibly use the extended ASCII character set:

function isASCII(str, extended) {
    return (extended ? /^[\x00-\xFF]*$/ : /^[\x00-\x7F]*$/).test(str);
}
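
As a rough usage sketch (the sample strings are made up for illustration), the function from this answer is restated below so the snippet runs on its own. Note that in JavaScript the range \x00-\xFF covers code points U+0000 through U+00FF, so characters such as "€" (U+20AC) are rejected even with the extended flag.

```javascript
// Restated from the answer above so this snippet is self-contained.
function isASCII(str, extended) {
    return (extended ? /^[\x00-\xFF]*$/ : /^[\x00-\x7F]*$/).test(str);
}

console.log(isASCII("Hello, world! (v1.0)?")); // true: letters, digits, punctuation
console.log(isASCII(""));                      // true: `*` accepts the empty string
console.log(isASCII("caf\u00E9"));             // false: U+00E9 is above \x7F
console.log(isASCII("caf\u00E9", true));       // true: U+00E9 is within \x00-\xFF
console.log(isASCII("\u20AC5", true));         // false: U+20AC is above \xFF
```

If the empty string should be rejected, replacing `*` with `+` in both patterns would do that.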
zzzzBov
  • 1
    @zerkms, between the two of us we'll get there. For now I'll assume `*` as a string of `''` is technically ASCII. – zzzzBov Jan 14 '13 at 04:05
  • 1
    I'm not insisting on `+` instead of `*`. When I posted my comment there were none of them. – zerkms Jan 14 '13 at 04:07
  • The extended option does not seem to cover the following characters: ["€", "‚", "ƒ", "„", "…", "†", "‡", "ˆ", "‰", "Š", "‹", "Œ", "Ž", "‘", "’", "“", "”", "•", "–", "—", "˜", "™", "š", "›", "œ", "ž"] All are part of the extended ASCII character set, as listed here: http://web.itu.edu.tr/~sgunduz/courses/mikroisl/ascii.html – SimonHawesome Aug 15 '16 at 20:51
  • 1
    @SimonHawesome, your source is bad. The [wikipedia entry](https://en.wikipedia.org/wiki/Extended_ASCII) has a reference table in an image at least. – zzzzBov Aug 15 '16 at 21:01
  • 1
    For the true one liner: `const isASCII = string => /^[\x00-\x7F]*$/.test(string);`. – Константин Ван Jan 13 '17 at 11:55
  • 7
    For **printable** ASCII: `const isPrintableASCII = string => /^[\x20-\x7F]*$/.test(string);`. – Константин Ван Jan 13 '17 at 11:59
  • 4
    @K_'s regex allows the DEL character; use `/^[\x20-\x7E]*$/` to disallow it. – alxndr Oct 27 '17 at 00:29
  • why is it returning true for `isPrintableASCII("0x2")`? – ClementWalter Oct 12 '22 at 14:22
  • 1
    @ClementWalter, `0`, `x`, and `2` are printable ascii characters. You didn't use a Unicode escape sequence. – zzzzBov Oct 12 '22 at 15:49
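
The printable-ASCII variants discussed in these comments can be sketched as follows. This is only an illustration: it uses the range \x20-\x7E (space through tilde), which excludes DEL as suggested above, and the sample inputs are made up.

```javascript
// Printable ASCII only: space (\x20) through tilde (\x7E); DEL (\x7F) is excluded.
const isPrintableASCII = (str) => /^[\x20-\x7E]*$/.test(str);

console.log(isPrintableASCII("0x2"));          // true: the three characters "0", "x", "2"
console.log(isPrintableASCII("\x02"));         // false: an actual control character
console.log(isPrintableASCII("a\x7Fb"));       // false: contains DEL
console.log(isPrintableASCII("line1\nline2")); // false: newline is a control character
```

The first two calls show the distinction raised in the last comment: "0x2" is ordinary text, while "\x02" is an escape sequence that produces a single control character.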
12

You don't need a regex for this; just check that every character in the string has a char code between 0 and 127:

function isValid(str){
    if(typeof(str)!=='string'){
        return false;
    }
    for(var i=0;i<str.length;i++){
        if(str.charCodeAt(i)>127){
            return false;
        }
    }
    return true;
}
Danilo Valente
  • 10
    "You don't need a RegEx to do it" --- why not use regex - it will be a trivial one liner – zerkms Jan 14 '13 at 04:04
  • If he wants "ASCII only" I guess it's >127, not >255 – ThiefMaster Jan 14 '13 at 04:04
  • 4
    RegExp is also significantly faster than this version. This is like 5x slower. – Mike Frysinger Aug 03 '17 at 03:11
  • For any kind of user input < 1k chars (will differ per machine) or so it will not matter if you use a `RegExp` vs this method. It can be done slightly cleaner if you strip the `typeof` check and simply `str.split('').map(c => c.charCodeAt(0)).every(c => c <= 127);` Trying to read the code later and decode the regex will be more difficult than understanding the `split(...).map(...)` example, especially when using escapes such as `\x[HEX_VALUE]`. -- Of course, this will not apply when performance **really** matters. – SidOfc May 17 '19 at 14:52
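
The `every`-based variant suggested in the last comment might look like the sketch below. The function name is hypothetical, the `typeof` guard from the answer is kept, and `Array.from` is used here instead of `split('')` (an assumption of this sketch, not part of the comment) so the string is walked by code point.

```javascript
// Hypothetical name; a regex-free check using Array.prototype.every.
function isAsciiByCharCode(str) {
    if (typeof str !== 'string') {
        return false;
    }
    // Array.from iterates the string by code point; every() is true for an empty array.
    return Array.from(str).every((ch) => ch.codePointAt(0) <= 127);
}

console.log(isAsciiByCharCode("abc"));          // true
console.log(isAsciiByCharCode("ab\u00E9"));     // false: U+00E9 is above 127
console.log(isAsciiByCharCode("\uD83D\uDE00")); // false: code point U+1F600
console.log(isAsciiByCharCode(42));             // false: not a string
```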
9

As of ES2018, RegExp supports Unicode property escapes, so you can use /[\p{ASCII}]+/u to match ASCII characters (anchor it as /^[\p{ASCII}]+$/u to test a whole string). It's much clearer now.

Supported browsers:

  • Chrome 64+
  • Safari/JavaScriptCore beginning in Safari Technology Preview 42
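
A minimal sketch of how the property escape could be used for the "ASCII only" check, assuming an engine with ES2018 support. The anchors and the `*` quantifier are choices made for this sketch; without anchors, the pattern succeeds as soon as any ASCII character appears anywhere in the string.

```javascript
// Anchored so the whole string must be ASCII; the `u` flag is required for \p{...}.
const isASCII = (str) => /^\p{ASCII}*$/u.test(str);

// Unanchored pattern from the answer: matches a run of ASCII anywhere in the string.
const containsASCII = (str) => /[\p{ASCII}]+/u.test(str);

console.log(isASCII("plain text?"));    // true
console.log(isASCII("\u00E9a"));        // false: U+00E9 is not ASCII
console.log(containsASCII("\u00E9a"));  // true: the "a" alone satisfies the pattern
```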
Kevin Yue
0
// Returns true if checkString contains any non-ASCII character (char code > 127).
var check = function(checkString) {

    var invalidCharsFound = false;

    for (var i = 0; i < checkString.length; i++) {
        var charValue = checkString.charCodeAt(i);

        /**
         * do not accept characters over 127
         **/

        if (charValue > 127) {
            invalidCharsFound = true;
            break;
        }
    }

    return invalidCharsFound;
};
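
Note that this function reports the presence of invalid characters, so its result is the inverse of the `isValid`-style checks in the other answers. A small sketch of the difference, with a compact restatement of the loop and a hypothetical wrapper named `isAsciiOnly`:

```javascript
// Compact restatement of the loop above so this snippet runs on its own.
var check = function (checkString) {
    for (var i = 0; i < checkString.length; i++) {
        if (checkString.charCodeAt(i) > 127) {
            return true; // a non-ASCII character was found
        }
    }
    return false;
};

// Hypothetical wrapper: true only when every character is ASCII.
var isAsciiOnly = function (str) {
    return !check(str);
};

console.log(check("abc"));             // false: nothing invalid found
console.log(check("ab\u00E9"));        // true: U+00E9 is above 127
console.log(isAsciiOnly("abc"));       // true
console.log(isAsciiOnly("ab\u00E9"));  // false
```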
StarPlayrX