1

Using Google re2 https://github.com/google/re2/blob/master/doc/syntax.txt

From a couple of lines like

  1. I love Rock
  2. I love Rock and scissors
  3. I hate paper
  4. I like Rock, paper and scissors
  5. I'd love myself

I want to extract "Rock", "paper"and "scissors" from each line. I want the regex to match all the above five lines and give me Rock, paper and scissors where it found something. I'm predominantly using this in Google sheets, but any Google re2 regex should help.

I've tried:

".*(([Rock]{0,4})).*"

".*(([Rock]{4})|([Rock]{0})).*"

=REGEXEXTRACT(A2,".*(Rock{0,2}).*(paper{0,2}).*(scissors{0,2}).*")

and multiple other combinations to get Rock from any line, if present... But, it always prefers zero rather than four... Even If it finds Rock , it returns empty strings. If i replace {0} with {1} i get "k" even though the full Rock is found.

Any ideas?

TheMaster
  • 45,448
  • 6
  • 62
  • 85
  • `[Rock]{0,4}` matches empty or 1 to 4 of the characters in the character class. – Toto Aug 02 '17 at 09:23
  • @Toto , Yes. If no Rock is found, it should give me empty, so that i don't get the error->the whole regex expression doesn't match the string. So, that it may move on to search for paper in the given string. Main problem is preferance. It should give preference to 4 characters rather than zero. – TheMaster Aug 02 '17 at 09:27
  • Because of the greedy `.*` arround. If you want to match `Rock` in a string, just use `\bRock\b` without anything else. – Toto Aug 02 '17 at 09:29
  • @toto i want all the 5 lines above including the fifth one to match the same regular expression, but capture the three words, if present. – TheMaster Aug 02 '17 at 09:41
  • Your \bRock\b won't match the fifth one and I'll get an error. That's why i want quantifiers for whole words. – TheMaster Aug 02 '17 at 09:42
  • `\b(?:Rock|paper|scissors)?\b` – Toto Aug 02 '17 at 09:48
  • "\b(?:Rock|paper|scissors)?\b" gives me a empty result too... Please try using this in a Google sheet – TheMaster Aug 02 '17 at 09:54
  • 2
    =REGEXEXTRACT(A16,"\b(?:Rock|paper|scissors)?\b") I tried removing the non capturing too.. Still the same result... – TheMaster Aug 02 '17 at 09:55

2 Answers2

5

I found some regex features are not supported in Google Sheets so far.

Please, try this workaround:

=ArrayFormula(IFERROR(REGEXREPLACE(A3,REGEXREPLACE(A3,"(Rock|paper|scissors)","(.*)"),{"$1","$2","$3"})))


In step 1 this formula makes regex for step2:

enter image description here

Max Makhrov
  • 17,309
  • 5
  • 55
  • 81
  • 1
    I did start my post with Google re2 GitHub to make known the limitations. Your formula works perfectly – TheMaster Aug 03 '17 at 09:23
0

[Rock] will search for letters R o c k. Instead, use (Rock)

Cid
  • 14,968
  • 4
  • 30
  • 45
  • 1
    I tried... Though it'll give me a perfect "Rock" match, i cannot quantify it... There may be zero rocks in the sentence and it'll give me a error. – TheMaster Aug 02 '17 at 09:30
  • What about `(Rock){0, 4}` ? – Cid Aug 02 '17 at 09:32
  • =REGEXEXTRACT(A2,".*(Rock{0,2}).*(paper{0,2}).*(scissors{0,2}).*") Also gives me a zero when 1 Rock is found-it prefers zero match to a correct match – TheMaster Aug 02 '17 at 09:32