Some opening punctuation characters (Unicode General Category Ps) and opening quote characters (Unicode General Category Pi) happen to have their appropriate closing character at the very next code point. For example, ( is U+0028 and ) is U+0029. Similarly, ⟪ is U+27EA and ⟫ is U+27EB. But there are exceptions, such as « (U+00AB), which has its matching character, », sixteen code points away at U+00BB.
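To make the observation concrete, here is a minimal sketch of the naive "next code point" check, using only the standard `unicodedata` module. The helper name `next_codepoint_closer` and the `CLOSING_CATEGORY` table are made up for illustration; this shows where the heuristic breaks down and is not a solution to the problem.

```python
import unicodedata

# Closing category expected for each opening category (illustrative table).
CLOSING_CATEGORY = {"Ps": "Pe", "Pi": "Pf"}

def next_codepoint_closer(opener):
    """Hypothetical helper: return the character at the next code point
    if it has the matching closing category, otherwise None."""
    expected = CLOSING_CATEGORY.get(unicodedata.category(opener))
    if expected is None:
        return None  # not an opening character (neither Ps nor Pi)
    candidate = chr(ord(opener) + 1)
    if unicodedata.category(candidate) == expected:
        return candidate
    return None  # the neighbor is not a closer, so the heuristic fails

for opener in "(⟪«":
    closer = next_codepoint_closer(opener)
    print(f"U+{ord(opener):04X}", unicodedata.category(opener), repr(closer))
```

The check finds a closer for ( and ⟪ but returns `None` for «, whose neighbor U+00AC is not a closing character.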
Given an opening character, how can I determine the appropriate closing character?
(I've tagged this question python because I ultimately want to accomplish this in Python, but a language-neutral answer is fine, too.)
Edit: Thanks for pointing me to "List of all unicode's open/close brackets?". In particular, this answer shows the pairs of brackets (i.e., Ps and Pe characters). But the question of finding a matching quote character (i.e., Pi and Pf characters) that doesn't happen to be a mirror image, like ’ for ‘, seems to be left open.