RegExp Validate SMS Text

Question

How do I write a RegExp to validate that an SMS text contains only keyboard characters (abc, ABC, 123, ~!@#$%^&*()`[]{}|;':',./<>?)?

Thanks...

Any luck with comments and answers? – M'vy Mar 08 '11 at 22:28

score 10 · Accepted Answer · answered Mar 08 '11 at 18:32

The default GSM character set is defined in GSM 03.38. Assuming you're looking at decoded text, not the 7-bit packed format that is actually used on the wire, a regex like the following should limit you to the allowable characters:

"@£$¥èéùìòÇ\fØø\nÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./[0-9]:;<=>\?¡[A-Z]ÄÖÑÜ§¿[a-z]äöñüà\^\{\}\[~\]\|€"

Note, though, that it is possible to send Unicode UCS-2 messages. In that case the handset receiving the message must have suitable glyphs to present the text to the user; Unicode itself is not the limiting factor.
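As a rough sketch of how the list above could be used in JavaScript (an assumption, since the question does not name a language), the characters can be wrapped in a single character class and anchored so that the whole message must match. The names GSM_TEXT and isGsmText are hypothetical, and the class is one reading of the GSM 03.38 default alphabet plus its extension table, so check it against the specification before relying on it.

```javascript
// Basic GSM 03.38 alphabet plus the extension-table characters
// (form feed, ^ { } \ [ ~ ] | €), written as one anchored character class.
// "-", "/", "\", "[" and "]" are escaped so they are taken literally.
const GSM_TEXT = /^[@£$¥èéùìòÇ\n\f\rØøÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !"#¤%&'()*+,\-.\/0-9:;<=>?¡A-ZÄÖÑÜ§¿a-zäöñüà^{}\\\[~\]|€]*$/;

// Returns true when every character of `text` is in the class above.
function isGsmText(text) {
  return GSM_TEXT.test(text);
}
```

Note that the backquote from the question's list is not part of the GSM default alphabet, so this check rejects it; such a message would have to go out as UCS-2.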

Worked like a charm, although I had to add [ and ] at the start and end and test it per character. – runholen, Oct 14 '16 at 11:04

score 4 · Answer 2 · answered Mar 03 '11 at 16:36

I propose to do it manually.

You just have to take care of a few exceptions: [ and ] need escaping, and the backquote and the quote may need escaping too, depending on the language you are writing in (since they could end the pattern string).

^[a-zA-Z0-9~!@#$%^&*()`\[\]{};':,./<>?| ]*$

Maybe it would require a little tuning. I'm pretty sure that - and _ are accepted in SMS texts.
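A minimal JavaScript sketch of that tuning, assuming the pattern is used as a regex literal: "-" and "_" are added, with "-" placed last in the class so it is not read as a range. The second pattern is a swapped-in alternative rather than the pattern above; it simply accepts every printable ASCII character. Both constant names are hypothetical.

```javascript
// The pattern above with "_" and "-" added; "/" is escaped for the regex literal.
const KEYBOARD_TEXT = /^[a-zA-Z0-9~!@#$%^&*()`\[\]{};':,.\/<>?| _-]*$/;

// Alternative: any printable ASCII character, from space (0x20) to tilde (0x7E).
const PRINTABLE_ASCII = /^[\x20-\x7E]*$/;
```

The hand-written class still rejects characters it does not list, such as the double quote, whereas the range version accepts them. Neither pattern accepts line breaks.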

score 1 · Answer 3 · edited Sep 24 '15 at 09:17

I searched a lot, and I think the best one is:

function CharecterControl(input) {
    var str = /[^A-Za-z0-9 \\r\\n@£$¥èéùìòÇØøÅå\u0394_\u03A6\u0393\u0027\u0022\u039B\u03A9\u03A0\u03A8\u03A3\u0398\u039EÆæßÉ!\#$%&()*+,\\./\-:;<=>?¡ÄÖÑÜ§¿äöñüà^{}\\\\\\[~\\]|\u20AC]*/;
    return !new RegExp(str).test(input);       
}

answered Jul 19 '13 at 14:13 by KnowGe · edited Sep 24 '15 at 09:17 by zVictor

it seems that backslashes have been escaped too many times in this regex. Wouldn't the right one be `[^A-Za-z0-9 \r\n@£$¥èéùìòÇØøÅå\u0394_\u03A6\u0393\u0027\u0022\u039B\u03A9\u03A0\u03A8\u03A3\u0398\u039EÆæßÉ!\#$%&()*+,\./\-:;<=>?¡ÄÖÑÜ§¿äöñüà^{}\\\[~\]|\u20AC]`? – zVictor Sep 24 '15 at 09:39
@zVictor is correct - that is the proper character string. Also, this sample threw me off - a better function name is `isGsmEncoded` - or for the reverse (if you want to test for non-GSM characters instead, just remove the `!`), it could then be called `hasUcs2Characters`. – qJake Mar 18 '20 at 20:35
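Putting the two comments above together, a hedged sketch could look like the following. It uses the character class from zVictor's comment (with the redundant escapes on "#" and "." dropped) and the function names qJake suggests. It also drops the trailing `*` from the posted function: with `*`, the pattern can match the empty string, so `test` always returns true and the function always returns false. Note that this class omits ¤, which the accepted answer's list includes.

```javascript
// Negated class from zVictor's comment: matches one character outside the listed set.
const NON_GSM = /[^A-Za-z0-9 \r\n@£$¥èéùìòÇØøÅå\u0394_\u03A6\u0393\u0027\u0022\u039B\u03A9\u03A0\u03A8\u03A3\u0398\u039EÆæßÉ!#$%&()*+,.\/\-:;<=>?¡ÄÖÑÜ§¿äöñüà^{}\\\[~\]|\u20AC]/;

// True when the input contains at least one character outside the set.
function hasUcs2Characters(input) {
  return NON_GSM.test(input);
}

// True when every character of the input is in the set.
function isGsmEncoded(input) {
  return !hasUcs2Characters(input);
}
```

Because the regex has no `g` flag, `test` keeps no state between calls, so the same constant can be reused for every message.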

score 1 · Answer 4 · answered Sep 06 '13 at 18:44

I know that I'm a little late to the party, but I've been fighting with this. I recently ran across Twitter's Open Source Project:

https://github.com/twitter/cloudhopper-commons-charset

It provides a great way of cleaning strings before sending them, based on charsets. It also supports encoding a string as bytes using an SMS-friendly charset. Here is my example of cleaning an existing string with their libraries before sending it through SMS:

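import com.cloudhopper.commons.charset.Charset;
import com.cloudhopper.commons.charset.CharsetUtil;
import com.cloudhopper.commons.charset.MobileTextUtil;

// Assumes `log` is a logger field declared on the enclosing class.
public static String cleanSMS(String msg) {
    // Look up the GSM 7-bit charset by name.
    Charset charset = CharsetUtil.map(CharsetUtil.NAME_GSM7);
    StringBuilder cleaned = new StringBuilder(msg);
    // Both replace calls modify `cleaned` in place.
    log.info("Accent chars replaced: " + MobileTextUtil.replaceAccentedChars(cleaned));
    log.info("Safe chars replaced: " + MobileTextUtil.replaceSafeUnicodeChars(cleaned));
    return CharsetUtil.normalize(cleaned.toString(), charset);
}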
