-1

Let's say I've got the following string:

"dog cat hello cat dog dog hello cat world"

and the two words "hello" and "world".

I want to get the string between these two words at the point where they are closest to each other (in terms of the number of words between them). In this example, the following strings lie between the two words:

  1. "cat dog dog hello cat"
  2. "cat"

Since "hello" and "world" are the closest in the 2nd option, the desired result would be "cat" in this example.

How do I do this with a regex (JavaScript flavor)?

The best I could come up with is

(?<=hello\s+).*?(?=\s+world)

but that only gives me the 1st option, i.e. "cat dog dog hello cat"

thadeuszlay
  • Why, no, this can't be determined with regex alone. So, that's the answer. Regex cannot count, or tell distances relative to something else, without hard boundaries. All the answers given here are just bunk, totally wrong. !! –  Apr 04 '19 at 22:12
  • what is your suggestion @sln – thadeuszlay Apr 04 '19 at 22:13
  • I would use a regex callback to determine the lengths (or number of words) between them. The regex I'd use is `/hello(?=(.*?)world)/g` then _keep the substring (group 1)_ with the shortest criteria, until no more matches. –  Apr 04 '19 at 22:19
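The callback idea from the last comment could look roughly like the sketch below. It is only an illustration under a few assumptions: the helper name `closestBetween` is hypothetical, the two words are assumed to contain no regex metacharacters, and `\b` word boundaries have been added to the commenter's pattern so that partial words are not matched.

```javascript
// Hypothetical helper (not from the thread): returns the text between
// startWord and endWord for the pair with the fewest words in between,
// or null when no such pair exists.
function closestBetween(text, startWord, endWord) {
  // Assumption: startWord and endWord contain no regex metacharacters.
  // The lookahead captures the text up to the next endWord without
  // consuming it, so every occurrence of startWord is examined.
  const re = new RegExp(`\\b${startWord}\\b(?=(.*?)\\b${endWord}\\b)`, "g");
  let best = null;
  let bestCount = Infinity;
  for (const m of text.matchAll(re)) {
    const between = m[1].trim();
    // Count the words in the candidate; an empty candidate has zero words.
    const count = between === "" ? 0 : between.split(/\s+/).length;
    if (count < bestCount) {
      bestCount = count;
      best = between;
    }
  }
  return best;
}

console.log(closestBetween("dog cat hello cat dog dog hello cat world", "hello", "world")); // "cat"
```

Because the comparison happens in ordinary code rather than inside the pattern, this also handles inputs where the shortest span is not the last one, such as `hello foo world dog hello cat cat dog dog world world`, for which it returns `foo`.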

2 Answers

1

You may use this regex, which relies on a negative lookahead:

/\bhello\s+((?:(?!\bhello\b).)*?)\s+world\b/

RegEx Demo

(?:(?!\bhello\b).) matches any single character, provided the word hello does not start at that position
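As a quick check, the pattern above can be applied in JavaScript like this. The variable names are illustrative only, and the desired text is read from capture group 1 rather than from the whole match.

```javascript
// The regex from this answer, applied to the string from the question.
const re = /\bhello\s+((?:(?!\bhello\b).)*?)\s+world\b/;
const text = "dog cat hello cat dog dog hello cat world";

const match = text.match(re);
// Group 1 holds the text between "hello" and "world"; null if nothing matched.
const between = match ? match[1] : null;

console.log(between); // "cat"
```

The attempt starting at the first `hello` fails because the lookahead refuses to step over the second `hello`, so the engine moves on and matches from the second `hello` instead.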

anubhava
0

You can use .*\bhello in the beginning of your pattern to greedily consume characters up to the last occurrence of hello, so that what you want would be in the capture group, without a hello or a world inside:

.*\bhello\s+(.*?)\s+world\b

Demo: https://regex101.com/r/FYZAdX/3
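A small sketch of how this pattern might be used from JavaScript, run against the string from the question and the two strings from the first comment below. The variable names are illustrative, and the result is taken from capture group 1, not from the full match.

```javascript
// The regex from this answer: a greedy prefix followed by a lazy capture group.
const pattern = /.*\bhello\s+(.*?)\s+world\b/;

const samples = [
  "dog cat hello cat dog dog hello cat world",
  "dog cat hello cat dog dog hello cat world world",
  "dog cat hello cat dog world dog hello cat world world",
];

// Collect capture group 1 for each sample (null when there is no match).
const results = samples.map((s) => {
  const m = s.match(pattern);
  return m ? m[1] : null;
});

console.log(results); // every entry is "cat"
```

Note that the greedy `.*` anchors the capture to the last `hello` that is followed by a `world`, so for an input like `hello foo world dog hello cat cat dog dog world world` the group holds `cat cat dog dog` rather than the shorter `foo`.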

blhsing
  • it wouldn't work with these sentences: `dog cat hello cat dog dog hello cat world world` and `dog cat hello cat dog world dog hello cat world world` – thadeuszlay Apr 04 '19 at 21:28
  • I see. Fixed with a lazy repeater instead then. – blhsing Apr 04 '19 at 21:30
  • This is just matching the last substring between `hello` and `world`, e.g. `hello foo world dog hello cat cat dog dog world world`. Here it will match `hello cat cat dog dog world` – anubhava Apr 04 '19 at 21:34
  • No, what you look for is in the capture group, not the match. See demo: https://regex101.com/r/FYZAdX/4 (look for the capture group on the right of the page) – blhsing Apr 04 '19 at 21:36