I need to extract the substring between "Design Brands > " and the first following pipe (|) character from the following string:

"T-shirts|Brands > Port & Company|Design Brands > Montana Griz|Designs > TeamLB Griz > MTG31|T-shirts > TeamLB|T-shirts > Montana Griz"

This is within a Google Sheets function, so I have to use Go's RE2 syntax.

I would expect the following expression to work:

Design Brands > (.*)\|

However, the capture group matches everything up to the last pipe in the string ("Montana Griz|Designs > TeamLB Griz > MTG31|T-shirts > TeamLB") instead of everything up to the first occurrence of a pipe. I can't figure out how to isolate just "Montana Griz" within a capture group.
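The greedy behaviour described above can be reproduced outside Sheets. Below is a minimal sketch in Go, whose `regexp` package uses RE2 syntax; the names `categories` and `greedyCapture` are illustrative only and not part of the sheet.

```go
package main

import (
	"fmt"
	"regexp"
)

// categories is the sample string from the question.
const categories = "T-shirts|Brands > Port & Company|Design Brands > Montana Griz|Designs > TeamLB Griz > MTG31|T-shirts > TeamLB|T-shirts > Montana Griz"

// greedyCapture (hypothetical helper) returns capture group 1 of the
// greedy pattern from the question, or "" when there is no match.
func greedyCapture(s string) string {
	re := regexp.MustCompile(`Design Brands > (.*)\|`)
	m := re.FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	// The greedy .* runs to the end of the string and backs off only as far
	// as the LAST pipe, so the capture spans several fields.
	fmt.Println(greedyCapture(categories))
	// prints: Montana Griz|Designs > TeamLB Griz > MTG31|T-shirts > TeamLB
}
```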

Emma
Colin Worf
  • How about making the `.*` [lazy](https://www.rexegg.com/regex-quantifiers.html#lazy_solution): [`Design Brands > (.*?)\|`](https://regex101.com/r/R15bm7/1/) – bobble bubble Jun 07 '19 at 01:11
  • Not a duplicate, the lazy dot doesn't work in RE2. The working solution to this problem is not posted anywhere on stack overflow. I spent hours trying to find one. – Colin Worf Jun 08 '19 at 02:27

1 Answer

Either make the dot lazy:

Design Brands > (.*?)\|

Or, if RE2 does not support the lazy dot, use a negated character class instead:

Design Brands > ([^|]*)\|

The second pattern says to:

Design Brands >    match "Design Brands > "
([^|]*)            then match and capture zero or more characters which are NOT a pipe
\|                 finally match the first pipe

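The ([^|]*) is a trick for matching all content up to, but not including, the first pipe which comes along. Because the character class cannot match a pipe, the capture has to stop at the first one, with no backtracking involved.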
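As a quick check of both patterns, here is a minimal sketch in Go, whose `regexp` package follows RE2 syntax. The helper `firstCapture` and the constant `categories` are names made up for this illustration, and behaviour inside Sheets itself is not tested here.

```go
package main

import (
	"fmt"
	"regexp"
)

// categories is the sample string from the question.
const categories = "T-shirts|Brands > Port & Company|Design Brands > Montana Griz|Designs > TeamLB Griz > MTG31|T-shirts > TeamLB|T-shirts > Montana Griz"

// firstCapture (hypothetical helper) compiles pattern and returns its first
// capture group for s, or "" when the pattern does not match.
func firstCapture(pattern, s string) string {
	m := regexp.MustCompile(pattern).FindStringSubmatch(s)
	if len(m) < 2 {
		return ""
	}
	return m[1]
}

func main() {
	// Negated character class: [^|]* cannot cross a pipe.
	fmt.Println(firstCapture(`Design Brands > ([^|]*)\|`, categories)) // prints: Montana Griz

	// Lazy quantifier: Go's regexp package accepts .*? as well.
	fmt.Println(firstCapture(`Design Brands > (.*?)\|`, categories)) // prints: Montana Griz
}
```

In a sheet, the same pattern would typically go into REGEXEXTRACT, for example `=REGEXEXTRACT(A1, "Design Brands > ([^|]*)\|")`, assuming the string lives in cell A1; with a single capture group the function returns just the captured text.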

Tim Biegeleisen