
How can a JavaScript RegEx be written so that it matches, for example, the word cube, but only if the word small does not appear within the 20 characters before it?

RegEx should match:

  • cube
  • red cube
  • wooden cube
  • small................cube

RegEx should not match:

  • small cube
  • small red cube
  • small wooden cube
  • ..........small......cube
  • any sphere

Currently my regex looks and works like this:

> var regex = /(?:(?!small).){20}cube/im;
undefined
> regex.test("small................cube")     // as expected
true
> regex.test("..........small......cube")     // as expected
false
> regex.test("01234567890123456789cube")      // as expected
true
> regex.test("0123456789012345678cube")       // should be `true`
false
> regex.test("cube")                          // should be `true`
false

The pattern requires exactly 20 characters in front of cube, none of which may be the start of small. But here is the problem: if cube appears within the first 20 characters of a string, the RegEx of course does not match, because there are not enough characters in front of cube.

How can the RegEx be fixed to prevent these false negatives?

hiddenbit
  • You have 2 options: reverse the string and reverse your pattern by using a lookahead or take a look at this [golden post](http://blog.stevenlevithan.com/archives/mimic-lookbehind-javascript) – HamZa Jul 15 '14 at 14:50
  • Maybe this post can be useful http://stackoverflow.com/questions/641407/javascript-negative-lookbehind-equivalent – Federico Piazza Jul 15 '14 at 15:01
  • Thanks so far! To clarify: this RegEx should be usable as part of other RegExes. Reversing is not an option, because then lookaheads would no longer work. Using an optional matching group and checking whether it contains something seems to be an easy solution for many cases, but only works if it is not used as a sub-RegEx. I would prefer a RegEx-only solution, if possible. – hiddenbit Jul 15 '14 at 18:14

2 Answers


You can use this regex:

.*?small.{0,15}cube|(.*?cube)

And use matched group #1 for your matches.
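A minimal sketch of how group #1 might be checked in practice. The helper name `matchesCube` and the `i` flag are assumptions added for illustration and are not part of the answer; the sketch also only inspects the first match in the string.

```javascript
// The answer's pattern: the left alternative consumes any "cube" that has
// "small" starting within the 20 characters before it and captures nothing;
// only the right alternative fills group 1.
const pattern = /.*?small.{0,15}cube|(.*?cube)/i;

// Hypothetical helper: true only when the capturing alternative matched.
function matchesCube(text) {
  const m = pattern.exec(text);
  return m !== null && m[1] !== undefined;
}

console.log(matchesCube("red cube"));                  // true
console.log(matchesCube("small red cube"));            // false
console.log(matchesCube("small................cube")); // true
console.log(matchesCube("..........small......cube")); // false
```

The `.{0,15}` follows from the 20-character window: `small` itself takes 5 of those characters, leaving at most 15 between it and `cube`.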


anubhava
  • Can you explain more about how the regex engine works to match your expression or the technique you used to do it? I really like your approach. – Federico Piazza Jul 16 '14 at 16:38
  • Actually this is a simple technique to work around the regex engine's lack of support for variable-length lookbehind. In this regex we match whatever we don't need on the left-hand side of the pipe (OR) construct, and capture what we do want in a group on the right-hand side. – anubhava Jul 16 '14 at 16:49
  • Thank you for this idea. Although I would prefer a solution without the additional step where I have to look "from outside" at the matched result again, this seems to be a quite simple way that works. So unless another plain RegEx solution is posted, this seems to be the best solution. – hiddenbit Jul 17 '14 at 17:59
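For comparison, one possible regex-only variant of the pattern from the question, offered as a sketch rather than an answer posted in this thread. It assumes that when fewer than 20 characters precede cube, the look-back should simply stop at the start of the line, and it handles that case with an anchored alternative.

```javascript
// Two alternatives in front of "cube":
//   ^(?:(?!small).){0,19}  -> fewer than 20 characters since the line start,
//                             none of which begins "small"
//   (?:(?!small).){20}     -> exactly 20 characters, none of which begins
//                             "small" (the pattern from the question)
const regex = /(?:^(?:(?!small).){0,19}|(?:(?!small).){20})cube/im;

console.log(regex.test("cube"));                      // true
console.log(regex.test("0123456789012345678cube"));   // true
console.log(regex.test("small................cube")); // true
console.log(regex.test("small cube"));                // false
console.log(regex.test("..........small......cube")); // false
```

As with the original pattern, the match includes the characters in front of cube, which may matter when this is embedded in a larger expression.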

I looked into this since it seemed easy, but it was definitely more difficult than I thought.

My idea was to try a negative lookbehind regex like:

(?<!small).{0,20}cube

But this didn't work, and of course JavaScript doesn't support negative lookbehind.

So I tried a different technique, which handles many cases such as the following:
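Engines that implement ES2018 have since gained lookbehind assertions, including variable-length ones. The following is a sketch that assumes such an engine (it is a syntax error in older ones); the variable name `lookbehindRegex` is illustrative only. The window is written as `small.{0,15}` so that `small` may start at most 20 characters before cube.

```javascript
// Negative lookbehind: fail if "small" plus 0 to 15 further characters
// ends exactly where "cube" begins.
const lookbehindRegex = /(?<!small.{0,15})cube/i;

console.log(lookbehindRegex.test("cube"));                      // true
console.log(lookbehindRegex.test("small red cube"));            // false
console.log(lookbehindRegex.test("small................cube")); // true
console.log(lookbehindRegex.test("..........small......cube")); // false
```

Because the lookbehind consumes nothing, the match is just cube itself.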

cube                      -> match
red cube                  -> match
wooden cube               -> match

small cube                -> not matched
small red cube            -> not matched
small wooden cube         -> not matched
..........small......cube -> not matched
any sphere                -> not matched

The idea is to do the following:

var newString = "cube" // <-- Change the string you want to test here
    .replace(/(small)?.{0,20}cube/g, function ($0, $1) { return $1?$0:"[match]"; });

and then compare newString to [match]. If it is different then your string didn't match.
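The replace-and-compare steps can be wrapped in a small function to try several inputs at once. This is only a sketch of the approach as described; the helper name `passesReplaceCheck` is hypothetical.

```javascript
// Hypothetical helper: runs the replace from the answer and reports whether
// the whole string collapsed to "[match]".
function passesReplaceCheck(text) {
  const newString = text.replace(/(small)?.{0,20}cube/g, function ($0, $1) {
    // If the optional group captured "small", put the text back unchanged.
    return $1 ? $0 : "[match]";
  });
  return newString === "[match]";
}

console.log(passesReplaceCheck("wooden cube"));               // true
console.log(passesReplaceCheck("small wooden cube"));         // false
console.log(passesReplaceCheck("..........small......cube")); // false
```

Note that, as written, the comparison only succeeds when the regex match covers the entire input, so any text left over before or after the match makes the check fail.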

I have struggled with some cases that should match but didn't, like:

small................cube
small.......cube

There is something wrong with .

I know that this doesn't fully answer your question, but I wanted to share this approach so that the community can see it and help improve the answer, or offer ideas for better answers.

Federico Piazza