This regex matches "so" when it's preceded by a comma and a space:

(,\s)(so)

I want to do the opposite now. I want to tell the regex: don't match "so" if it's preceded by a comma and a space. I tried this (after seeing this SO question):

^(,\s)(so)

But now the regex doesn't match anything: https://regexr.com/4kgq8.

Note: I'm not trying to match beginning of the line.

alexchenco

3 Answers

Alternatively, if that's acceptable, you can avoid lookarounds and use alternation: match and discard the unwanted ", so" on the left, and capture a standalone "so" in group 1 on the right:

, so|(\bso\b)

const regex = /, so|(\bso\b)/gmi;
const str = `So that 
so big that 
, so
and not , so
, so so`;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    
    // Group 1 is set only for a "so" that is not preceded by ", ".
    // For the discarded ", so" alternative it is undefined, so skip it.
    if (m[1] !== undefined) {
        console.log(`Found "${m[1]}" at index ${m.index}`);
    }
}

If you wish to explore, simplify, or modify the expression, it is explained in the top-right panel of regex101.com. If you'd like, you can also see in this link how it matches against some sample inputs.


Emma

As anubhava mentions in the comment, with a negative lookbehind you could use /(?<!,\s)(so)/, which matches so when it is not preceded by a comma and a space (and captures so). This is the reverse of /(?<=,\s)(so)/, which matches so when it is preceded by a comma and a space.
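As a quick illustration of the two lookbehind forms, here is a minimal sketch. The sample string and the helper name indexesOf are made up for this example, and it assumes an engine that supports lookbehind and String.prototype.matchAll.

```javascript
// Hypothetical sample input, not taken from the question's demo.
const sample = "so big, so what, and so on";

// Illustrative helper: start index of every match of a global regex.
const indexesOf = (re, s) => [...s.matchAll(re)].map((m) => m.index);

// "so" NOT preceded by a comma and a whitespace character.
const notAfterComma = indexesOf(/(?<!,\s)(so)/g, sample);

// "so" preceded by a comma and a whitespace character.
const afterComma = indexesOf(/(?<=,\s)(so)/g, sample);

console.log(notAfterComma); // [ 0, 21 ]
console.log(afterComma);    // [ 8 ]
```

Because lookbehind is zero-width, each reported index points directly at so, with no prefix to skip.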

Your regexp /(,\s)(so)/ matches a comma, a space and so (and captures the comma and the space in one group, and so in another). The negation of that can be constructed using a negative lookahead, which is supported in all browsers, like so: /((?!,\s)..|^.?)(so)/. It matches two characters (or fewer, if at the start of the string) that are not a comma followed by a space, then so (capturing the preceding characters in one group and so in another).
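Since the lookahead-based pattern also consumes the characters before so, the position of so itself has to be derived from the first group. A minimal sketch, assuming the pattern above with the g flag added; the helper name findSo and the test strings are illustrative only:

```javascript
// Illustrative helper: start index of each "so" found by the
// lookahead-based pattern (no lookbehind required).
function findSo(text) {
  var re = /((?!,\s)..|^.?)(so)/g;
  var found = [];
  var m;
  while ((m = re.exec(text)) !== null) {
    // m.index points at the prefix (group 1); skip over it to reach "so".
    found.push(m.index + m[1].length);
  }
  return found;
}

console.log(findSo("so big that")); // [ 0 ]
console.log(findSo("and not, so")); // []
console.log(findSo(", so so"));     // [ 5 ]
```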

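Typically, this second approach has a drawback: because it matches more than you want, the restriction against overlapping matches can make you lose a match here and there. For an input like ", so so" this is not a problem, since the rejected so is never consumed. However, when two valid occurrences are fewer than two characters apart (for example "so so"), the first match consumes the characters the second would need as its prefix, so a global search skips the second one.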
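To see how the no-overlap restriction behaves on concrete inputs, the sketch below compares match counts for the lookahead workaround and the lookbehind form. The helper countMatches and the test strings are illustrative, and the lookbehind literal assumes an engine that supports it. On these inputs, the workaround keeps the match in ", so so" but reports only one of the two occurrences in "so so", because the first match consumes the characters the second would need as its prefix.

```javascript
// Illustrative helper: count matches of a global regex with an exec loop.
function countMatches(re, text) {
  re.lastIndex = 0; // start from the beginning regardless of earlier use
  var n = 0;
  while (re.exec(text) !== null) n++;
  return n;
}

var workaround = /((?!,\s)..|^.?)(so)/g;
var lookbehind = /(?<!,\s)(so)/g; // assumes lookbehind support

// ", so so": the first "so" is rejected, the second is found by both.
console.log(countMatches(workaround, ", so so"), countMatches(lookbehind, ", so so")); // 1 1

// "so so": both occurrences are valid, but the workaround finds only one.
console.log(countMatches(workaround, "so so"), countMatches(lookbehind, "so so")); // 1 2
```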

(EDIT: I wrote "space" here but all the expressions are written following OP's use of \s, which is actually for whitespace, which includes more than just space.)

Amadan

I want to tell the regex: don't match "so" if it's preceded by a comma and a space.

The best solution is to use a negative lookbehind, as I mentioned in my comment below the question:

/(?<!, )so/g

Here, (?<!, ) is a negative lookbehind expression that fails the match if a comma and a space are present before so.

RegEx Demo 1

The caveat is that lookbehind in JavaScript is only supported in modern browsers.
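One way to cope with that caveat is to probe for lookbehind support at run time. This is only a sketch: the helper name supportsLookbehind and the sample string are assumptions for illustration. It relies on the fact that the RegExp constructor throws a SyntaxError for a pattern the engine cannot compile.

```javascript
// Illustrative helper: returns true if this engine can compile a lookbehind.
function supportsLookbehind() {
  try {
    // Built from a string so an unsupported engine throws at run time
    // instead of failing to parse the whole script.
    new RegExp("(?<!, )so");
    return true;
  } catch (e) {
    return false;
  }
}

// Use the lookbehind pattern only when it can be compiled.
var soRegex = supportsLookbehind() ? new RegExp("(?<!, )so", "g") : null;

console.log(
  soRegex
    ? "so, so and so".replace(soRegex, "SO") // "SO, so and SO"
    : "lookbehind not supported; fall back to a capture-group pattern"
);
```

Writing the pattern as a regex literal would not work for this check, because an engine without lookbehind would reject the script before the try block ever runs.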


If you also need to support legacy or older browsers, the approach is to use a capture group and discard the unwanted match in an alternation:

/(?:, so|(\bso))\b/g

RegEx Demo 2

Here, a match counts only when capture group #1 is present. The left-hand side of the alternation matches and discards the unwanted ", so"; the desired match is on the right-hand side, where it is captured in group #1.

Code:

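var arr = ['So that',
'so big that',
', so so'];

const regex = /(?:, so|(\bso))\b/g;

arr.forEach((el) => {
  let m;
  // Loop until exec() returns null; that also resets lastIndex for the next string.
  while ((m = regex.exec(el)) !== null) {
    // Group 1 is undefined when the discarded ", so" alternative matched.
    if (m[1] !== undefined)
      console.log('Line: [', el, '] Start:', m.index, m[1]);
  }
});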
anubhava