How to get, for example..., a code point pattern like x-y\uxxxx\Uxxxxxxxxx
from the Connector Punctuation (Pc) category, for scanning ECMAScript 3/JavaScript identifiers?
Original question
I need help for verifying a valid character (code point) of a ECMA-262 (3º edition, 7.6) identifier for a lexical scanner.
Syntax quote
Identifier
::
IdentifierName
but notReservedWord
IdentifierName
::
IdentifierStart
IdentifierName
IdentifierPart
IdentifierStart
::UnicodeLetter
- $
- _
\# no need to check thisUnicodeEscapeSequence
IdentifierPart
::
IdentifierStart
UnicodeCombiningMark
UnicodeDigit
UnicodeConnectorPunctuation
UnicodeLetter
::
- any character in the Unicode categories “Uppercase letter (Lu)”, “Lowercase > letter (Ll)”, “Titlecase letter (Lt)”, “Modifier letter (Lm)”, “Other letter (Lo)”, or “Letter number (Nl)”.
UnicodeCombiningMark
::
- any character in the Unicode categories “Non-spacing mark (Mn)” or “Combining spacing mark (Mc)”
UnicodeDigit
::
- any character in the Unicode category “Decimal number (Nd)”
UnicodeConnectorPunctuation
::
- any character in the Unicode category “Connector punctuation (Pc)”
As you can see, it takes any character of certain categories.
I need to have all these possible characters, so my first step was to locate "Connector punctuation" on this Unicode 5.0 chart, but 0 matches and I believe I'm doing it the wrong way. So could someone help me?