
I have a dilemma here. I am trying to write a regex pattern that matches all alphabetic characters in Eastern as well as Western languages. One criterion is that no numbers can match (so José13 is not a match, but José is); the other is that special characters cannot match (e.g. !@#$%).

I've played around with this in Chrome's console, and I've gotten:

"a".match('[a-zA-Z]');

to come back successfully, but when I put in:

"a".match('[\p{L}]');

I get a null response, and I don't quite understand why. According to http://www.regular-expressions.info/unicode.html, \p{L} matches any letter.

EDIT: \p doesn't seem to work in my Chrome console, so I'll try a different route. I have a Unicode chart from Unifoundry. I'll check the regex against it and try to exclude the ranges of characters that should be invalid.
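For anyone reproducing the null result: a minimal sketch of what the console is doing. In a JavaScript string literal, `\p` is not a recognized escape, so the backslash is dropped before the regex engine ever sees the pattern. The second half assumes an ES2018-or-later engine, where `\p{L}` is supported when the `u` flag is set; that is an assumption about the runtime, not something available in the console discussed in this thread.

```javascript
// "\p" in a string literal collapses to "p", so the pattern is really "[p{L}]":
// a character class containing the four characters p, {, L and }.
const fromString = '[\p{L}]';
console.log(fromString);                      // [p{L}]
console.log("a".match(fromString));           // null
console.log("L".match(fromString) !== null);  // true

// Assumption: ES2018+ engine. With the `u` flag, \p{L} is a Unicode
// property escape that matches any code point in the Letter category.
const letter = /\p{L}/u;
console.log(letter.test("a"));   // true
console.log(letter.test("é"));   // true
console.log(letter.test("漢"));  // true
console.log(letter.test("1"));   // false
console.log(letter.test("#"));   // false
```

Without the `u` flag, `/\p{L}/` is still accepted, but `\p` is treated as a plain `p`, which is why the failure is silent rather than a syntax error.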

Any input would be greatly appreciated.

ResourceReaper
  • What do you mean by “alpha characters” and “eastern languages”? The approach won’t work, as @icchthedral remarks, so you need to define what exactly you want to include. – Jukka K. Korpela Dec 09 '13 at 15:01
  • I mean all non-numerical, non-mathematical, non-punctuation characters for all languages. I want someone to be able to enter ResourceReaper but not ResourceReaper#, and RésourceRéaper but not RésourceRéaper12 or RésourceRéaper#. – ResourceReaper Dec 09 '13 at 15:18

2 Answers


This works in the JavaScript console, but it seems like a hack:

"José".match(/^[^\u0000-\u0040\u005B-\u0060\u007B-\u00BF\u00D7\u00F7]+$/);

However it does what I need it to do.
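As a rough sketch, the exclusion list can be wrapped in a small validator, with the match anchored at both ends and `+` as the quantifier so that the whole input has to consist of allowed characters. The helper name `isLettersOnly` is hypothetical and not from the thread.

```javascript
// Hypothetical helper built on the exclusion ranges from the answer:
//   \u0000-\u0040  controls, space, ASCII punctuation, digits, @
//   \u005B-\u0060  [ \ ] ^ _ `
//   \u007B-\u00BF  { | } ~, DEL, C1 controls, Latin-1 symbols
//   \u00D7, \u00F7 multiplication and division signs
const ALLOWED_ONLY = /^[^\u0000-\u0040\u005B-\u0060\u007B-\u00BF\u00D7\u00F7]+$/;

function isLettersOnly(input) {
  return ALLOWED_ONLY.test(input);
}

console.log(isLettersOnly("José"));    // true
console.log(isLettersOnly("José13"));  // false
console.log(isLettersOnly("José#"));   // false
console.log(isLettersOnly("漢字"));    // true
console.log(isLettersOnly(""));        // false

// Limitation: nothing above U+00FF is excluded, so non-ASCII digits and
// punctuation still pass, e.g. the Arabic-Indic digit three (U+0663).
console.log(isLettersOnly("José\u0663"));  // true
```

Because this is a blocklist, every non-letter range that matters has to be added by hand; the last line shows one that slips through.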

Referenced this post on SO: Javascript + Unicode regexes

ResourceReaper
  • Unfortunately, with JavaScript's weak Regex and string libraries, this really is the best you can do short of seeing if someone's written more robust libraries. – Jacob Dec 09 '13 at 17:25

JavaScript implementations current at the time of this answer don't support Unicode property shortcuts such as \p{L}, but you can specify an explicit code point range, for example:

/[\u4E00-\u9FFF]+/g.test("漢字")
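The range approach extends to several scripts by listing more ranges in one character class. The sketch below is illustrative only: the chosen ranges (Basic Latin, the Latin-1 letters, and CJK Unified Ideographs) and the names `LETTER_RANGES` and `lettersOnly` are assumptions, and a real allowlist would need many more blocks.

```javascript
// Assumed, illustrative allowlist; extend with further ranges as needed.
const LETTER_RANGES = [
  "A-Za-z",                                         // Basic Latin letters
  "\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u00FF",  // Latin-1 letters, skipping × and ÷
  "\\u4E00-\\u9FFF",                                // CJK Unified Ideographs
];

// Anchored so the entire string must consist of characters in the ranges.
const lettersOnly = new RegExp("^[" + LETTER_RANGES.join("") + "]+$");

console.log(lettersOnly.test("José"));    // true
console.log(lettersOnly.test("漢字"));    // true
console.log(lettersOnly.test("José13"));  // false
console.log(lettersOnly.test("漢字#"));   // false
```

An allowlist like this rejects anything not listed, so it fails closed for scripts that were forgotten, which is the opposite trade-off from excluding known non-letters.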
nullpotent