How can I adjust this regex to pick only one match in an alternation phrase please. I'm using the Google Sheets REGEXEXTRACT function to collect parts of a string. I have almost all conditions covered by using alternation - except one, the one in which the source string includes more than one of the alternation matches.
A simple test example, in B1:B4 I have.
=REGEXEXTRACT(A1,"([^\d]*)\s?(?:vs|bc)")
=REGEXEXTRACT(A2,"([^\d]*)\s?(?:vs|bc)")
=REGEXEXTRACT(A3,"([^\d]*)\s?(?:bc|vs)")
=REGEXEXTRACT(A4,"([^\d]*)\s?(?:bc|vs)")
(note the inverted bc|vs alternation in B3 & B4)
In A1:A4 I have
oiu tgi vs ghj bc fgh
de nbv the vs ghj fgh
ghj YHTF bc fgh
iii iii kji bc fgh vs oiu
The results I want in B1:B4
oiu tgi
de nbv the
ghj YHTF
iii iii kji
The problem is the engine (quite correctly) matches both alternation phrases and extracts everything to the left of the last match (rightmost) in the source string. Not what I want. The various text lengths between the alternation matches is random. Is my whole approach wrong?
The results I actually get in B1:B4 are:
oiu tgi vs ghj
de nbv the
ghj YHTF
iii iii kji bc fgh
As the number of rows in the real data is many thousands I obviously want a universal regex for all variants.