
I have a requirement where I am looking for the words "house" and "car" but they have to be within 10 words of each other. I have the following regular expression:

(\b(?=.*[a-zA-Z])(?i)car\b)|(\b(?=.*[a-zA-Z])(?i)house\b)

This works for any combination of the words, but it does not enforce the "within 10 words of each other" requirement.

For example, the following should match:

car word1 word2 word3 word4 word5 word6 word7 word8 word9 word10 house
house word1 word2 word3 word4 word5 word6 word7 word8 word9 word10 car

However, the following should not match:

house word1 word2 word3 word4 word5 word6 word7 word8 word9 word10 word11 car

car word1 word2 word3 word4 word5 word6 word7 word8 word9 word10 word11 house

How can I accomplish this? Thanks in advance.

Fast Chip

3 Answers


If both words must be present and the match must not run between two occurrences of the same word, capture the first word (house or car) in a group.

Then repeat, 1-10 times, any word that is not exactly house or car, and finish by matching either word, using a negative lookahead to make sure it is not the word already captured in group 1.

\b(house|car)(?: (?!(?:house|car)\b)\w+){1,10} (?!\1)(house|car)\b

Explanation

  • \b(house|car) A word boundary, capture either house or car in group 1
  • (?: Non capture group
    • (?!(?:house|car)\b)\w+ Negative lookahead, assert what is directly to the right is not either house or car. If that is true, match 1+ word characters
  • ){1,10} Close the group and repeat it 1-10 times
  • (?!\1) Negative lookahead, assert what is directly to the right is not the same word captured in group 1
  • (house|car)\b Capture group 2, match either house or car followed by a word boundary

Change the quantifier to {0,10} if the second word may directly follow the first, or adjust the bounds to fit your requirement.
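As a quick check, here is a minimal Python sketch that runs the pattern against the examples from the question. The helper names `filler` and `is_near` are made up for illustration, and `re.IGNORECASE` is an assumption carried over from the `(?i)` flag in the question's own regex.

```python
import re

# Pattern from the answer above. IGNORECASE is an assumption that mirrors
# the (?i) flag used in the question's original regex.
PATTERN = re.compile(
    r"\b(house|car)(?: (?!(?:house|car)\b)\w+){1,10} (?!\1)(house|car)\b",
    re.IGNORECASE,
)

def filler(n):
    """Hypothetical helper: build 'word1 word2 ... wordn'."""
    return " ".join(f"word{i}" for i in range(1, n + 1))

def is_near(text):
    """True if house and car occur with 1-10 other words between them."""
    return PATTERN.search(text) is not None

samples = [
    f"car {filler(10)} house",   # 10 words in between
    f"house {filler(10)} car",   # same distance, reversed order
    f"house {filler(11)} car",   # 11 words in between
    "car word1 car",             # same word at both ends
]
for s in samples:
    print(is_near(s), "<-", s)
```

The first two samples should report a match and the last two should not. Note that with {1,10} a bare "car house" is also rejected, which is what the {0,10} variant addresses.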


If there can be a match between the same words:

\b(house|car)(?: (?!(?:house|car)\b)\w+){1,10} (house|car)\b
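To see what dropping the `(?!\1)` check changes, a small Python sketch can run this variant on a sentence where the same word appears at both ends. The sample sentence and the name `SAME_OK` are illustrative assumptions, not part of the original answer.

```python
import re

# Variant without the (?!\1) lookahead, so both ends may be the same word.
SAME_OK = re.compile(
    r"\b(house|car)(?: (?!(?:house|car)\b)\w+){1,10} (house|car)\b"
)

# Hypothetical sample text with "car" on both sides.
m = SAME_OK.search("my car is parked near another car today")
if m:
    print(m.group(0))  # the full matched span
    print(m.groups())  # the two captured end words
```

Here the match runs from the first "car" to the second one, which the stricter pattern with `(?!\1)` would refuse.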


The fourth bird

You can use this:

/\b((?:house|car)\W+(?:\w+\W+){0,5}?(?:house|car)|(?:house|car)\W+(?:\w+\W+){0,10}?(?:house|car))\b/g


Wylie Fowler

Why not simply try:

(car( [^ ]*){1,10} house)|(house( [^ ]*){1,10} car)

See https://regex101.com/r/rAu6Ku/1
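A short Python sketch for trying this simpler pattern on the question's examples follows. The names `SIMPLE` and `words` are hypothetical. One behaviour worth being aware of: the pattern has no `\b` anchors, so "car" inside a longer word such as "scar" can also start a match.

```python
import re

# Pattern from this answer, unchanged.
SIMPLE = re.compile(r"(car( [^ ]*){1,10} house)|(house( [^ ]*){1,10} car)")

def words(n):
    """Hypothetical helper: build 'word1 word2 ... wordn'."""
    return " ".join(f"word{i}" for i in range(1, n + 1))

# 10 words in between, then 11 words in between.
print(bool(SIMPLE.search(f"car {words(10)} house")))
print(bool(SIMPLE.search(f"car {words(11)} house")))

# No word boundaries: the "car" inside "scar" is accepted as a start.
print(bool(SIMPLE.search("scar word1 house")))
```

If partial-word hits are unwanted, wrapping each alternative in `\b ... \b` is one possible adjustment.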

Pierre François