4

I have been trying to extract the words after this DONE=> sign using REGEX. But its not working.

Your help will be highly appreciated.

I have tried with these formula but its not working

=TRIM(REGEXEXTRACT(A3," => \ +$"))

=ArrayFormula(IFERROR(REGEXEXTRACT(A3:A,"^\*\*\*\ DONE=> - (.* )? ")))

DATA

KIM - 2021 COMPLETE - OLAP - 03-01-2021...DONE=> APWC 02-2021, BCMOI 02-2021, QAF 02-2021, PPN 02-2021,

Result would be like this.

APWC 02-2021, BCMOI 02-2021, QAF 02-2021, PPN 02-2021

2 Answers2

3

You can use

REGEXEXTRACT(A3:A, "DONE=>\s*(.+)\b")

See the regex demo

Details:

  • DONE=> - a string
  • \s* - zero or more whitespaces
  • (.+) - Group 1 (this group value is the return value): any zero or more chars other than line break chars, as many as possible
  • \b - a word boundary (trims the right-hand punctuation).
Wiktor Stribiżew
  • 607,720
  • 39
  • 448
  • 563
  • Thank you very much. One more favor please if i want to include the `DONE=>` at the start like this result `DONE=> APWC 02-2021, BCMOI 02-2021, QAF 02-2021, PPN 02-2021` is it possible. –  Mar 29 '21 at 18:42
  • 1
    @Mento Yes, just remove the parentheses, `"DONE=>\s*.+\b"` – Wiktor Stribiżew Mar 29 '21 at 18:45
2

Here are a couple of solutions with links to perm. files / screenshots.

A) RegEx - re2 (Sheets)

RegEx expression

See here for the file assoc. with screenshot above.

B) Mid / Search Formula

=Mid(B4,search($C$3,B4)+len($C$3)+1,len(B4))

where:

  • First 'data' point is in cell B4 (i.e. sentence you provided "KIM - 2021 COMPLETE - OLAP - 03-01-2021...DONE=> APWC...."
  • Desired 'trigger' word in cell C3 (e.g. "DONE=>") - you can change this to whatever you wish (only picks up first occurrence, and returns everything after it as req.
  • Text returned: column C (first result in cell C4)

How it works

  1. Mid(target,a,b) returns b characters from text in target cell, starting from character in the 'ath' position
  2. E.g. with target = "Hello", a = 2 and b = 3, Mid("Hello",2,3) = "ell" (without quotation marks)
  3. For your case, using a = search("DONE=>") returns text from this point (start of word), so a = search("DONE=>")+ len("DONE=>") + 1 should place the first character of the resulting string to return immediately after the first occurrence of the word "DONE=>"
  4. len(string) simply returns the length of the variable defined as string. E.g. len("Hello") = 5. Since the number of characters following "DONE=>" cannot possibly exceed the total number of characters in the original text (i.e. your 'Data'), Excel will automatically limit to the maximum available characters, and so you can rest assured that all characters after "DONE=>" are returned (assuming this word features in the Data)
  5. Iferror and Trim are just to ensure cleaner results (e.g. missing "DONE=>", excess space characters etc.)

Screenshot

Mid-Search formulation

Formulation of mid / search example

C) RegEx variations

Sheets use re2 for regex. Ordinary RegEx would look like this:

(DONE=>\K)..+

See here (perma-link available for conversion to python/java etc.). E.g. in Python this would work as follows:

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r"(DONE=>\K)..+"

test_str = "KIM - 2021 COMPLETE - OLAP - 03-01-2021...DONE=> APWC 02-2021, BCMOI 02-2021, QAF 02-2021, PPN 02-2021"

matches = re.finditer(regex, test_str, re.MULTILINE)

for matchNum, match in enumerate(matches, start=1):
    
    print ("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum = matchNum, start = match.start(), end = match.end(), match = match.group()))
    
    for groupNum in range(0, len(match.groups())):
        groupNum = groupNum + 1
        
        print ("Group {groupNum} found at {start}-{end}: {group}".format(groupNum = groupNum, start = match.start(groupNum), end = match.end(groupNum), group = match.group(groupNum)))

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.

D) Related Links

JB-007
  • 2,156
  • 1
  • 6
  • 22
  • Thank you very much @James for the answer and elaborating this thing. –  Mar 30 '21 at 11:01
  • 1
    Thanks for recognition - I think if I got back on this a little sooner it would have scored a few more upvotes :) but all good - knowledge in community achieved in any case - main thing...! – JB-007 Mar 30 '21 at 19:47
  • Yes ! I appreciate you response you brilliantly provided sooner or later community will appreciate it. –  Mar 31 '21 at 11:21