You can use \p{L} in JavaScript environments that support ECMAScript 2018 or later, but keep in mind that Unicode property classes are only supported when you pass the u modifier/flag:
a.match(/\p{L}+/gu)
a.match(/\p{Alphabetic}+/gu)
Either call will match all occurrences of one or more Unicode letters in the string a.
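As a minimal sketch of the calls above (the sample string is made up for illustration), this shows what \p{L}+ returns with the u flag, and what the same pattern means without it:

```javascript
// Hypothetical sample input; any string works here.
const a = "Привет, 世界! 123 abc";

// With the u flag, \p{L}+ matches runs of Unicode letters.
const words = a.match(/\p{L}+/gu);
console.log(words); // ["Привет", "世界", "abc"]

// Without the u flag, \p is only an escaped "p", so the pattern
// looks for the literal text "p{L}" instead of letters.
const withoutFlag = a.match(/\p{L}+/g);
console.log(withoutFlag); // null
```

Digits and punctuation are skipped because they are not in the L category.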
NOTE that \p{Alphabetic} (\p{Alpha}) includes all letters matched by \p{L}, plus letter numbers matched by \p{Nl} (e.g. Ⅻ, the character for the Roman numeral 12), plus some other symbols matched with \p{Other_Alphabetic} (\p{OAlpha}).
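The difference can be checked directly. This is a small sketch, assuming an ES2018+ engine; U+0345 is picked here as one example of a character that is alphabetic only through Other_Alphabetic. The check goes through \p{Alphabetic} rather than \p{Other_Alphabetic} itself, because, as far as I know, JavaScript engines do not expose the latter as a property escape.

```javascript
const romanTwelve = "\u216B";  // Ⅻ, a letter number (category Nl)
const ypogegrammeni = "\u0345"; // combining mark listed under Other_Alphabetic

const checks = {
  // Ⅻ is not a letter, but it is a letter number and is alphabetic.
  twelveIsLetter: /^\p{L}$/u.test(romanTwelve),
  twelveIsLetterNumber: /^\p{Nl}$/u.test(romanTwelve),
  twelveIsAlphabetic: /^\p{Alphabetic}$/u.test(romanTwelve),
  // \p{Alpha} is the short alias of \p{Alphabetic}.
  twelveIsAlpha: /^\p{Alpha}$/u.test(romanTwelve),
  // A combining mark: not a letter, yet alphabetic.
  markIsLetter: /^\p{L}$/u.test(ypogegrammeni),
  markIsAlphabetic: /^\p{Alphabetic}$/u.test(ypogegrammeni),
};

console.log(checks);
```

So a pattern built on \p{Alphabetic} keeps characters such as Ⅻ that \p{L} would drop.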
There are some things to bear in mind, though, when using the u modifier with a regex:
- You can use Unicode code point escape sequences such as \u{1F42A} for specifying characters via code points. Normal Unicode escapes such as \u03B1 are limited to four hexadecimal digits, which covers only the Basic Multilingual Plane (source)
- "Characters of 4 bytes are handled correctly: as a single character, not two 2-byte characters" (source)
- Escaping requirements for patterns compiled with the u flag are stricter: you can't escape arbitrary characters, you can only escape those that can actually behave as special characters. See HTML input pattern not working.