You want the best regex trick (+ (*SKIP)(*FAIL)) and a tempered greedy token:
(w\hatev[e3]rY0uD()nt\Want\toMatch)
(*SKIP)(*FAIL)
|
(?:(?!(?1)).)+
(?1) is a subroutine call: it re-applies the expression enclosed in the first group at the current position. You can replace it with the expression itself if your flavor does not support subroutine calls. The same goes for (*SKIP)(*FAIL): use whatever your language offers to discard a match when the first group is not null, or similar.
This has some particular advantages over splitting the string with w\hatev[e3]rY0uD()nt\Want\toMatch. For example, (ab)(*SKIP)(*FAIL)|(?:(?!(?1)).)+ finds the following matches:
abbacaba: bac, a
ababcdeab: cde
Try it on regex101.com.
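For flavors without (*SKIP)(*FAIL) or subroutine calls, such as Python's built-in re module, the fallback described above can be sketched roughly as follows. This is only an illustration: the helper name matches_outside is made up, the unwanted expression is inlined in place of (?1), and it assumes that expression never matches the empty string.

```python
import re

def matches_outside(unwanted, text):
    """Return the runs of `text` that contain no match of `unwanted`.

    Hypothetical helper: `unwanted` is a regex source string that is
    assumed never to match the empty string.
    """
    # Group 1 captures the unwanted part. The second alternative is the
    # tempered run, with the expression written out where (?1) would go.
    pattern = re.compile(f'({unwanted})|(?:(?!{unwanted}).)+')
    # Stand-in for (*SKIP)(*FAIL): drop every match where group 1 took part.
    return [m.group(0) for m in pattern.finditer(text) if m.group(1) is None]

print(matches_outside('ab', 'abbacaba'))   # ['bac', 'a']
print(matches_outside('ab', 'ababcdeab'))  # ['cde']
```

Because the unwanted alternative is tried first at every position, a match of the second alternative can never start inside or run into an unwanted part.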
Since + matches one or more characters, no match is ever empty, so there is no filtering needed. On the other hand, your language's equivalent of .split(), if any (I'm looking at you, Lua), will typically return even the empty strings. Take Python for example:
import re
print(re.split('ab', 'ababcdeab')) # ['', '', 'cde', '']
If you want to match single characters, simply drop the quantifier:
(w\hatev[e3]rY0uD()nt\Want\toMatch)
(*SKIP)(*FAIL)
|
(?!(?1)).
Try it on regex101.com.
On the other hand, this trick may not be worth it. Just do a split and filter out anything you don't want, for your colleagues' sake.
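A minimal sketch of that simpler route in Python, reusing the ab example. The function name split_and_filter is just for illustration, and it assumes the delimiter pattern has no capturing groups, since re.split would otherwise also return the captured text.

```python
import re

def split_and_filter(delimiter, text):
    # re.split yields '' between adjacent delimiters and at the edges;
    # keeping only truthy pieces removes those empty strings.
    return [piece for piece in re.split(delimiter, text) if piece]

print(split_and_filter('ab', 'ababcdeab'))  # ['cde']
print(split_and_filter('ab', 'abbacaba'))   # ['bac', 'a']
```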