When constructing a regular expression that matches a list of candidate strings, how can I ensure that every candidate can be matched in full? For example, the regular expression
(?:O|SO|S|OH|OS)(?:\s?[-+*°.1-9]){0,4}
matches all of the examples below:
O 4 2 -
O 2 -
SO 4 * - 2
S 2-
However, if I swap S and SO, the resulting regular expression
(?:O|S|SO|OH|OS)(?:\s?[-+*°.1-9]){0,4}
fails to match SO 4 * - 2 as a whole. Instead, the string is split into two matches: S and O 4 * - 2.
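The behaviour can be reproduced with a short script. This is a minimal sketch that assumes Python's re module (the question does not depend on a particular regex flavour); the names SUFFIX, so_first and s_first are made up for illustration.

```python
import re

# Shared tail of both patterns: up to four charge/count tokens,
# each optionally preceded by one whitespace character.
SUFFIX = r"(?:\s?[-+*°.1-9]){0,4}"

# Original order: SO is tried before S.
so_first = re.compile(r"(?:O|SO|S|OH|OS)" + SUFFIX)
# Swapped order: S is tried before SO.
s_first = re.compile(r"(?:O|S|SO|OH|OS)" + SUFFIX)

text = "SO 4 * - 2"

# With no capturing groups, findall returns the full text of each match.
so_first_matches = so_first.findall(text)
s_first_matches = s_first.findall(text)

print(so_first_matches)  # ['SO 4 * - 2']
print(s_first_matches)   # ['S', 'O 4 * - 2']
```

In the swapped pattern the alternative S succeeds first, and because the optional suffix can match zero times, the engine accepts S on its own and never backtracks to try SO.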
So my question is: how should I order the candidate strings in the alternation so that each of them is matched in full and unambiguously? Since the actual list of candidates in my project is more complicated than this example, is there a sorting algorithm that achieves this?
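One ordering I could try is longest candidate first, so that no candidate is shadowed by a shorter one that is its prefix. The sketch below assumes Python's re module and a backtracking engine that tries alternatives from left to right; build_alternation is a hypothetical helper, not an existing library function.

```python
import re

def build_alternation(candidates):
    """Return a non-capturing group with the longest candidates first.

    Ties are broken alphabetically so the result is deterministic.
    """
    ordered = sorted(set(candidates), key=lambda s: (-len(s), s))
    # re.escape guards against candidates that contain regex metacharacters.
    return "(?:" + "|".join(re.escape(s) for s in ordered) + ")"

prefix = build_alternation(["O", "S", "SO", "OH", "OS"])
pattern = re.compile(prefix + r"(?:\s?[-+*°.1-9]){0,4}")

examples = ["O 4 2 -", "O 2 -", "SO 4 * - 2", "S 2-"]
results = [pattern.findall(e) for e in examples]

print(prefix)   # (?:OH|OS|SO|O|S)
print(results)  # each example comes back as a single full match
```

Only prefix relationships between candidates matter here, so any order that places a string before all of its proper prefixes would serve equally well; sorting by descending length is just a simple way to get such an order.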