Conditions updated
It is common to want to extract a substring up to (immediately before) certain characters. For example, suppose you have a text that:
- Does not start with a semicolon or a period,
- Contains several sentences,
- Does not contain any "\n", and
- Ends with a period,
and you want to extract the sequence from the start up to the closest semicolon or period. Two strategies come to mind:
/[^;.]*/
/.*?[;.]/
I pick between these more or less at random, with a slight preference for the second strategy, and I also see both in other people's code. Which is the better way? Is there a clear reason to prefer one over the other, or are there better ways? I personally feel, efficiency aside, that negating something (as with [^]) is conceptually more complex than not doing it. But efficiency may also be a good reason to choose one over the other.
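To make the comparison concrete, here is a minimal sketch of the two strategies. It assumes Ruby (the patterns themselves are not Ruby-specific), and the sample string is made up for illustration. One thing it shows is that the two patterns do not return quite the same thing: the negated class stops before the delimiter, while the lazy version includes it unless the delimiter is moved into a lookahead.

```ruby
# Hypothetical sample text satisfying the conditions listed above:
# no leading semicolon/period, several sentences, no "\n", ends with a period.
text = "First sentence; second part. Third sentence."

# Strategy 1: negated character class.
# Matches everything up to, but not including, the first ';' or '.'.
negated = text[/[^;.]*/]

# Strategy 2: lazy quantifier followed by the delimiter class.
# The match includes the delimiter itself.
lazy = text[/.*?[;.]/]

# Variant of strategy 2 (an illustrative addition, not one of the two
# original patterns): a lookahead keeps the delimiter out of the match.
lazy_lookahead = text[/.*?(?=[;.])/]

puts negated
puts lazy
puts lazy_lookahead
```

With this sample, the first and third patterns give the same substring, and the second gives that substring plus the trailing semicolon.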