How do I match Unicode special alpha characters while NOT matching special characters

Question

I'm trying to write a regex pattern that matches all alphabetic characters for Eastern as well as Western languages. One criterion is that no digits may match (so José13 is not a match, but José is); the other is that special characters (e.g. !@#$%) must not match either.

I've played around with this in Chrome's console, and I've gotten:

"a".match('[a-zA-Z]');

to come back successfully. When I put in:

"a".match('[\p{L}]');

I get a null response, and I don't understand why. According to http://www.regular-expressions.info/unicode.html, \p{L} matches any letter.
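One likely reason for the null result, offered as an observation rather than a definitive diagnosis: when the pattern is passed as a string literal, JavaScript drops the backslash from the unrecognized escape \p before the regex is built, and a regex without the u flag also treats \p as a plain p. The character class then contains only the characters p, {, L and }. The short check below illustrates this; the variable names are illustrative only.

```javascript
// In a plain string literal, "\p" is an unrecognized escape, so the
// backslash is dropped and the string is the same as '[p{L}]'.
const pattern = '[\p{L}]';
const sameAsPlain = pattern === '[p{L}]';

// The resulting class matches only the literal characters p, {, L and }.
const matchesA = "a".match(pattern); // null, as seen in the question
const matchesP = "p".match(pattern); // matches the letter "p"

console.log(sameAsPlain, matchesA, matchesP && matchesP[0]);
```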

EDIT: \p doesn't seem to work in my Chrome console, so I'll try a different route. I have a chart of Unicode code points from Unifoundry; I'll match it against the regex and try to exclude the ranges of characters I don't want.

Any input would be greatly appreciated.

What do you mean by “alpha characters” and “eastern languages”? The approach won’t work, as @icchthedral remarks, so you need to define what exactly you want to include. – Jukka K. Korpela, Dec 09 '13 at 15:01
I mean all non-numerical, non-mathematical, non-punctuation characters for all languages. I want someone to be able to enter ResourceReaper but not ResourceReaper#, or RésourceRéaper but not RésourceRéaper12 or RésourceRéaper#. – ResourceReaper, Dec 09 '13 at 15:18

score 1 · Accepted Answer · edited May 23 '17 at 11:49

This works in the JavaScript console, but it seems like a hack:

.match('^[^\u0000-\u0040\u005B-\u0060\u007B-\u00BF\u00D7\u00F7]*');

However, it does what I need it to do.
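Note that the pattern as posted has a start anchor but no end anchor and uses *, so on its own it returns the leading run of allowed characters (possibly empty) rather than rejecting the whole input. A minimal sketch of whole-string validation with the same excluded ranges might look like this; the function name isLettersOnly is hypothetical, and the + and $ are additions that are not in the original answer.

```javascript
// Same excluded ranges as the pattern above, written as a regex literal.
// Assumption: "+" and "$" are added so the entire input must consist of
// allowed characters and the empty string is rejected.
const LETTERS_ONLY = /^[^\u0000-\u0040\u005B-\u0060\u007B-\u00BF\u00D7\u00F7]+$/;

// Hypothetical helper name, used only for illustration.
function isLettersOnly(input) {
  return LETTERS_ONLY.test(input);
}

console.log(isLettersOnly("Jos\u00E9"));   // "José"   -> true
console.log(isLettersOnly("Jos\u00E913")); // "José13" -> false
console.log(isLettersOnly("Jos\u00E9#"));  // "José#"  -> false
```

This still accepts everything above U+00BF except × and ÷, so symbols such as € (U+20AC) pass; the exclusion list would need extending to rule those out.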

Referenced this post on SO: Javascript + Unicode regexes

edited May 23 '17 at 11:49

Community

answered Dec 09 '13 at 17:00

ResourceReaper

Unfortunately, with JavaScript's weak regex and string libraries, this really is the best you can do short of seeing if someone's written more robust libraries. – Jacob, Dec 09 '13 at 17:25

score 0 · Answer 2 · answered Dec 09 '13 at 14:49

At the time of writing, JavaScript implementations don't support shortcuts like \p{L}, but you can specify a code point range instead, for example:

/[\u4E00-\u9FFF]+/g.test("漢字")
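As a sketch of how the range approach could be turned into whole-string validation, the example below anchors the same CJK range. It also shows an alternative that goes beyond this answer and is an assumption about the runtime: engines implementing ES2018 Unicode property escapes accept \p{L} when the regex carries the u flag. The constant names are illustrative.

```javascript
// Range from the answer, anchored so every character must fall inside it.
const CJK_ONLY = /^[\u4E00-\u9FFF]+$/;

// Assumption: the engine supports ES2018 Unicode property escapes.
// \p{L} matches any Unicode letter, but only with the "u" flag.
const ANY_LETTERS = /^\p{L}+$/u;

const jose = "Jos\u00E9"; // "José" with a precomposed é

console.log(CJK_ONLY.test("漢字"));         // true
console.log(CJK_ONLY.test(jose));           // false
console.log(ANY_LETTERS.test(jose));        // true
console.log(ANY_LETTERS.test("漢字"));      // true
console.log(ANY_LETTERS.test(jose + "13")); // false
console.log(ANY_LETTERS.test(jose + "#"));  // false
```

Text in decomposed form (e followed by the combining accent U+0301) would not pass \p{L} alone; adding \p{M} to the class is one way to allow combining marks.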

answered Dec 09 '13 at 14:49

nullpotent
