-2

How can I adjust this regex to pick only one match in an alternation phrase please. I'm using the Google Sheets REGEXEXTRACT function to collect parts of a string. I have almost all conditions covered by using alternation - except one, the one in which the source string includes more than one of the alternation matches.

A simple test example, in B1:B4 I have.

=REGEXEXTRACT(A1,"([^\d]*)\s?(?:vs|bc)")
=REGEXEXTRACT(A2,"([^\d]*)\s?(?:vs|bc)")
=REGEXEXTRACT(A3,"([^\d]*)\s?(?:bc|vs)")
=REGEXEXTRACT(A4,"([^\d]*)\s?(?:bc|vs)")

(note the inverted bc|vs alternation in B3 & B4)

In A1:A4 I have

oiu tgi vs ghj bc fgh
de nbv the vs ghj fgh
ghj YHTF bc fgh
iii iii kji bc fgh vs oiu

The results I want in B1:B4

oiu tgi
de nbv the
ghj YHTF
iii iii kji

The problem is the engine (quite correctly) matches both alternation phrases and extracts everything to the left of the last match (rightmost) in the source string. Not what I want. The various text lengths between the alternation matches is random. Is my whole approach wrong?

The results I actually get in B1:B4 are:

oiu tgi vs ghj 
de nbv the 
ghj YHTF 
iii iii kji bc fgh 

As the number of rows in the real data is many thousands I obviously want a universal regex for all variants.

DeeKay789
  • 353
  • 2
  • 4
  • 8

1 Answers1

0

See if this works

=ArrayFormula(regexreplace(A1:A4, "(\s(?:bc|vs).+$)",))
JPV
  • 26,499
  • 4
  • 33
  • 48
  • Thx JPV. Perfect. Marked as a correct answer (bit late, sorry). – DeeKay789 Jul 06 '19 at 18:54
  • btw. my question related specifically to regex in Google Sheets. For those following, the link provided by W Stribizew did not provide an answer, just a clue. – DeeKay789 Jul 06 '19 at 18:55