1

I have a small JavaScript validation script that validates inputs with a regex. I want to allow certain uncommon characters (I'm not sure whether they're UTF-8). For example, I want to allow the character ’, which looks like a single quote but isn't.

The HTML entity for it is &#8217;, but I'm not sure how to put this into the regex.

I've tried just inputting [&#8217]* but it doesn't validate.

Eduard Luca

3 Answers

2

How about

/[\u2019]/

It uses the character's Unicode escape rather than the HTML entity. 2019 is the hexadecimal form of decimal 8217.

http://jsfiddle.net/eV2ek/
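As a rough sketch of how the escape could sit inside a validation pattern, and of why the entity form fails: inside a character class, `&#8217` is read as the six separate characters `&`, `#`, `8`, `2`, `1`, `7`. The names `NAME_PATTERN`, `isValidName`, and `entityClass`, and the set of allowed characters, are assumptions made up for illustration, not part of the original script.

```javascript
// Hypothetical validator: ASCII letters, spaces, the ASCII apostrophe,
// and the right single quotation mark (U+2019).
const NAME_PATTERN = /^[A-Za-z '\u2019]+$/;

function isValidName(input) {
  return NAME_PATTERN.test(input);
}

// Decimal 8217 is hexadecimal 2019, so &#8217; and \u2019 name the same character.
const codePointHex = (8217).toString(16); // "2019"

// The entity text inside a class only matches its own literal characters.
const entityClass = /^[&#8217]*$/;

console.log(isValidName("O\u2019Brien"));  // true
console.log(isValidName("O\"Brien"));      // false
console.log(entityClass.test("\u2019"));   // false
console.log(entityClass.test("#2&"));      // true
```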

Musa
1

As long as you properly declare the encoding of your JavaScript file (or of the page that holds it, if the script is inline), either through the charset attribute or the Content-Type header, you can use any character that has no special meaning in a regexp simply by typing it literally:

/’/
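A small sketch, assuming the file is saved and served as UTF-8, suggesting that the literal form and the escaped form behave the same way. The variable names are only illustrative.

```javascript
// Literal form: depends on the source file being read as UTF-8.
const literalPattern = /’/;
// Escaped form: pure ASCII source, independent of the file encoding.
const escapedPattern = /\u2019/;

const sample = "it’s";

console.log(literalPattern.test(sample));      // true
console.log(escapedPattern.test(sample));      // true
console.log("’".charCodeAt(0) === 0x2019);     // true
```

If an editor or transfer step mangles the file's encoding, only the literal form is affected, which is the trade-off between the two spellings.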
Oleg V. Volkov
  • I've thought of that, but the problem is that if I open it with an editor/IDE that doesn't support "special" characters (like I'm *guessing* `nano`), this character might be lost/replaced. – Eduard Luca Oct 20 '12 at 01:02
  • Yes, using character code is most bulletproof, but I find it hard to believe that any actively updated editor (and `nano`, as I remember, is) doesn't support UTF-8. – Oleg V. Volkov Oct 20 '12 at 12:28
0

The equivalent of the HTML entity &#8217; (the character ’) in a regex in most environments is

\u2019

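However, Perl and PCRE do not support \u; they use the \x{...} syntax instead. The braces are required for code points above FF, because a bare \x2019 would be read as \x20 followed by the literal digits 19:

\x{2019}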

as 2019 is hex of decimal 8217.
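For the JavaScript side, a brief sketch of how the escape forms differ. In JavaScript, `\x` takes exactly two hex digits, so it cannot express this character, while `\u2019` works everywhere and `\u{2019}` works when the `u` flag is set (ES2015 and later). The variable names are illustrative assumptions.

```javascript
// \x reads only two hex digits: this is "\x20" (a space) followed by "19".
const shortHexEscape = /\x2019/;
// Four-digit escape: matches U+2019.
const fourDigitEscape = /\u2019/;
// Code point escape: needs the u flag.
const codePointEscape = /\u{2019}/u;

// Build the character from its decimal code, 8217.
const quote = String.fromCharCode(8217);

console.log(shortHexEscape.test(" 19"));   // true
console.log(shortHexEscape.test(quote));   // false
console.log(fourDigitEscape.test(quote));  // true
console.log(codePointEscape.test(quote));  // true
```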

Regarding Unicode in JavaScript regexes, see: Javascript + Unicode regexes

Community
Ωmega