1

How can I match all occurrences of a word that have no specific character anywhere before them?

For example, if I want to match every apple that has no character b anywhere before it, how should I do it?

dolphin elephant apple star    <-- Matched
dog cat apple banana            <-- Matched
map banana apple dog           <-- Unmatched (Since there's a b before the apple)
map apple banana apple cat <-- The first apple matched, but the second one unmatched.
map apple banana apple banana apple <-- Only the first apple matched, others are unmatched.
map apple dog apple banana apple banana apple <-- The first apple and the second apple matched, others are unmatched.

Here's my try:

/(?<!.*b.*)apple/g

Of course, the regex above is invalid, because the quantifiers (the asterisks in this case) inside the lookbehind make it non-fixed-width. How can I solve this problem?

Wiktor Stribiżew
blue cat

3 Answers

1

It has already been mentioned that lookbehind is not available in JS regex (this holds for engines that predate ES2018). To me it reads like you want to match, and eventually replace, the words that occur before the specific character.

I would split the string at the first occurrence of that character and capture the split-off remainder. Then match/replace only in the first part and rejoin the parts afterwards. [^]* matches any characters, including line breaks.

// Test strings
var strs = ['dolphin elephant apple star','dog cat apple banana','map banana apple dog',
'map apple banana apple cat','map apple banana apple banana apple',
'map apple dog apple banana apple banana apple'];

// Split string at separator - Replace in first part - Rejoin
for (var str of strs) {
  var parts = str.split(/(b[^]*)/);
  parts[0] = parts[0].replace(/\b(apple)\b/g, '<b>$1</b>');
  var new_str = parts.join('');
  
  // Check result
  console.log(new_str);
}
bobble bubble
0

First, search for the first occurrence of the character. Then take the substring from 0 to that index and match the pattern against that substring. If the character is not found, just search the whole string.
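The steps above can be sketched in JavaScript roughly as follows. This is only a minimal illustration, not code from the answer: the function name findBeforeChar and its parameters are hypothetical, and the pattern is assumed to be a regex with the g flag (required by String.prototype.matchAll).

```javascript
// Hypothetical helper (not from the original answer): returns the start
// index of every match of `pattern` that lies before the first `blocker`.
// Assumption: `pattern` is a RegExp created with the g flag.
function findBeforeChar(text, pattern, blocker) {
  // Step 1: locate the first occurrence of the blocking character
  const cut = text.indexOf(blocker);
  // Step 2: search only the part before it, or the whole string if absent
  const head = cut === -1 ? text : text.slice(0, cut);
  // Step 3: collect match positions (valid in `text`, since `head` is a prefix)
  return Array.from(head.matchAll(pattern), (m) => m.index);
}

console.log(findBeforeChar('dolphin elephant apple star', /apple/g, 'b'));
console.log(findBeforeChar('map banana apple dog', /apple/g, 'b'));
console.log(findBeforeChar('map apple dog apple banana apple banana apple', /apple/g, 'b'));
```

Because the searched substring is a prefix of the original string, the returned indices can be used directly against the original string.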

Leo Aso
0

Regex flavors differ. What you cannot do with a pure regex can usually be compensated for in code.

The .NET regex engine and the Python PyPI regex module support infinite-width lookbehind patterns, so your approach will work there (see this regex demo).

In Java, (?<!b.{0,1000})apple will work, since the Java regex engine supports constrained-width lookbehind patterns (tested at OCPSoftware regex tester).

In PHP, you may use the well-known (*SKIP)(*FAIL) PCRE verbs to skip what you do not need: b.*apple(*SKIP)(*F)|apple. The .* must be greedy here, so that every apple after the first b is skipped rather than only the nearest one.

In JavaScript and Python re, use an optional capturing group and check whether it matched. If it did, discard the match; otherwise, keep it.

Here is a JS implementation (see the regex demo):

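// Test strings
var ss = ['dolphin elephant apple star', 'dog cat apple banana', 'map banana apple dog',
  'map apple banana apple cat', 'map apple banana apple banana apple',
  'map apple dog apple banana apple banana apple'];
// Group 1 greedily consumes everything from a "b" up to the last "apple" after it.
// A lazy b.*? would stop at the nearest apple and let a later one slip through.
var rx = /(b.*)?apple/g;
for (var s of ss) {
  console.log("Testing '" + s + "'.....");
  var m;
  while ((m = rx.exec(s))) {
    // Group 1 did not participate, so no "b" precedes this "apple"
    if (!m[1]) console.log(m[0], " at ", m.index);
  }
  console.log("===================");
}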
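The infinite-width lookbehind mentioned above for .NET and PyPI regex is also accepted by JavaScript engines that implement ES2018 lookbehind assertions. The sketch below assumes such an engine (older ones throw a SyntaxError on the pattern). The helper name unblockedIndices is hypothetical and only serves the illustration.

```javascript
// Assumption: the engine supports ES2018 lookbehind assertions.
// (?<!b.*) fails whenever a "b" occurs anywhere earlier on the same line.
const lookbehindRx = /(?<!b.*)apple/g;

// Hypothetical helper: start indices of every "apple" with no "b" before it.
function unblockedIndices(text) {
  return Array.from(text.matchAll(lookbehindRx), (m) => m.index);
}

console.log(unblockedIndices('dog cat apple banana'));
console.log(unblockedIndices('map banana apple dog'));
console.log(unblockedIndices('map apple banana apple cat'));
```

Since . does not match line terminators by default, a b on an earlier line is not seen by this pattern. Adding the s flag (also introduced in ES2018) would make the check span the whole input.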
Wiktor Stribiżew