5

I have a Google Sheet with a list of transactions. I need to isolate up to three words or numbers in the column E that come after the word "end" in the column D. The proper regex function should be the positive lookbehind, but it is not supported in RE2 (source: Using positive-lookahead (?=regex) with re2).

This formula returns an error in GSheets:

=REGEXEXTRACT(D1;"(?<=end\s)(\w+)")

So my conclusion is that regex is a dead end in this case.

How do I obtain the requested result in GSheets?

Daniel Morell
  • 2,308
  • 20
  • 34
knuckle_sandwich
  • 207
  • 1
  • 6
  • 12
  • Can't you just use `=REGEXEXTRACT(D1;"end\s*(\w+)")`? Could you please provide an example of the string in D1 and expected result? If you need 1, 2 or 3 whitespace separated words, you will need something like `"end\s*(\w+(?:\s+\w+){0,2})"` – Wiktor Stribiżew Dec 22 '18 at 16:40

2 Answers2

6

You may use a capturing group in your regex to make REGEXEXTRACT return just that captured part:

=REGEXEXTRACT(D1;"end\s*(\w+)")

If you need to return 1, 2 or 3 whitespace-separated words after end, use

=REGEXEXTRACT(D1;"end\s*(\w+(?:\s+\w+){0,2})")

See the online demo (Golang regex also uses RE2).

Details

  • end - end
  • \s* - 0+ whitespaces
  • (\w+(?:\s+\w+){0,2}) - Capturing group 1:
    • \w+ - 1+ word chars (letters, digits or _)
    • (?:\s+\w+){0,2} - 0, 1 or 2 occurrences of
      • \s+ - 1+ whitespaces
      • \w+ - 1+ word chars.
Wiktor Stribiżew
  • 607,720
  • 39
  • 448
  • 563
4

If you desperately need a different RegEx backend other than RE2, you can use an App Script to create a custom function that uses JS to evaluate the RegEx.

  1. Click Tools > Script Editor.
  2. In the editor add your custom JS RegEx function. You can use the example below.
  3. Give it a name JS_REGEXEXTRACT is a good option.
  4. Click save button in the toolbar.
  5. Go back to the browser tab with your Sheet in it, and replace REGEXEXTRACT with JS_REGEXEXTRACT.

You now have a working JS base RegEx option. It will not be as fast as the RE2 implementation, so be careful on large datasets with complex expressions.

/**
 * Extracts matching substrings according to a regular expression using JS not RE2.
 *
 * @param {"abcdefg"} text The input text.
 * @param {"c.*f"} regular_expression The first part of `text` that matches this expression will be returned.
 * @return Extracts matching substrings according to a regular expression.
 * @customfunction
 */
function JS_REGEXEXTRACT(text, regular_expression) {
  if (text.length < 1) {
    return null;
  }
  matches = text.match(new RegExp(regular_expression))
  if (matches && matches.length >= 1) {
    return matches[1];
  }
  return matches;
}
Daniel Morell
  • 2,308
  • 20
  • 34