In certain flavours of regular expression, inside a square bracket expression, the =
symbol is a special character that is used as a delimiter for enclosing any one the elements within an equivalence class. The documentation says the following:
An equivalence class expression shall represent the set of collating elements belonging to an equivalence class, as described in Collation Order. Only primary equivalence classes shall be recognized. The class shall be expressed by enclosing any one of the collating elements in the equivalence class within bracket-equal ( "[=" and "=]" ) delimiters. For example, if 'a', 'à', and 'â' belong to the same equivalence class, then "[[=a=]b]", "[[=à=]b]", and "[[=â=]b]" are each equivalent to "[aàâb]". If the collating element does not belong to an equivalence class, the equivalence class expression shall be treated as a collating symbol.
I am not quite sure what this means. If a
, à
and â
belong to the same equivalence class, does that mean we wish to specify that the regular expressions "[ab]"
, "[àb]"
and "[âb]"
are equivalent? Then what is the purpose of using the [=
=]
delimiter, since we might as well write "[aàâb]"
?
I understand what an "equivalence class" means in its general definition, but I am unable to grasp its meaning in this context.