0

I have a Google Sheets table with input in column A, and I'd want to achieve this result using REGEXEXTRACT.

Desired result:

Input Output
Stock OutNew21554 - Shirt - Red New | Stock Out
NewStock Out54872 - Shirt - Green New | Stock Out

This is what I attempted.

01

=ArrayFormula(REGEXEXTRACT(A1:A2, "[(Stock Out)|(New)]+"))
Input Output
Stock OutNew21554 - Shirt - Red Stock OutNew
NewStock Out54872 - Shirt - Green NewStock Out

02

=ArrayFormula(REGEXEXTRACT(A2:A3, "(Stock Out)|(New)+"))
Input Output
Stock OutNew21554 - Shirt - Red Stock Out
NewStock Out54872 - Shirt - Green
Osm
  • 2,699
  • 2
  • 4
  • 26

2 Answers2

1

Use two instances of regexextract() in an { array expression }, wrapped in iferror():

=arrayformula( iferror( 
  { 
    regexextract(A2:A3, "New"), 
    regexextract(A2:A3, "Stock Out") 
  } 
) )
doubleunary
  • 13,842
  • 3
  • 18
  • 51
  • Hi -- thank you for the response, I wonder if one `regexextract()` can get the result? I removed `iferror()` to test. – Osm Oct 09 '22 at 15:03
  • The question does not specify the need to use just one `regexextract()`, and that limitation seems a bit artificial. It would be understandable if you did not know the number of result columns, i.e., the number of extractions, in advance, but the desired results you show suggest that do you know that. – doubleunary Oct 09 '22 at 15:07
  • Do you suggest to edit the question or add another one? – Osm Oct 09 '22 at 15:10
  • This question already has an answer with a vote, so the best practice is to post a [new question](https://webapps.stackexchange.com/questions/ask). – doubleunary Oct 09 '22 at 15:17
  • 1
    Wouldn't the desired results suggest that it is just a [mre]? The title says "multiple words" and not just two words? – TheMaster Oct 09 '22 at 16:11
1

There's no way to this in a single regex without generating all possible permutations or without lookaround support. However, we can call regexextract repeatedly using REDUCE. For eg, to extract, New,Stock and Color,

=BYROW(A2:A3,LAMBDA(row,TEXTJOIN(" | ",1,REDUCE(,{"New","Stock Out","Red|Green"},LAMBDA(a,c,{a;IFNA(REGEXEXTRACT(row,c))})))))

This supports unlimited number of words to extract.

Output
New | Stock Out | Red
New | Stock Out | Green
TheMaster
  • 45,448
  • 6
  • 62
  • 85
  • Thanks @themaster its much needed workaround, I adjusted it to be case insensitive and separated groups like this `=BYROW(A2:A4,LAMBDA(row,TEXTJOIN(" | ",1,REDUCE("",{"new","Stock Out","Red|Black|Green"},LAMBDA(a,c,{a;IFNA(REGEXEXTRACT(row,"(?i)"&c))})))))` – Osm Oct 10 '22 at 17:24
  • Nice! I think that will always get `New` even when not present in data. How about `reduce( iferror(1/0), { "New", "Stock Out", "Red|Green" }...`? – doubleunary Oct 10 '22 at 17:25
  • @doubleunary does it work the same as `REDUCE("",{...` ? – Osm Oct 10 '22 at 17:27
  • @Osm I don't use `iferror` – TheMaster Oct 10 '22 at 17:31
  • What did you use from doubleunary comment? – Osm Oct 10 '22 at 17:32
  • @Osm The ``initial value`` for ``reduce`` will always be present in output, even without `regexextract`. The `initial_value` is `New`. I just moved it into the `array` argument and made the `initial_value` empty. See post revisions. – TheMaster Oct 10 '22 at 17:34
  • I see `iferror(1/0)` is the same as `REDUCE("",{...` or `REDUCE(,{...`. @TheMaster – Osm Oct 10 '22 at 17:35
  • @Osm Yes, your formula also does it correctly. – TheMaster Oct 10 '22 at 17:36