I have the following regex:

/(1+(?=0|$))|(0+(?=1|$))/

I use it on strings composed only of 0 and 1 characters, to split the string into groups of 0s and 1s. My regex matches either a group of 1s followed by a 0 or the end of the line, 1+(?=0|$), or a group of 0s followed by a 1 or the end of the line, 0+(?=1|$).

> '001100011101'.split(/(1+(?=0|$))|(0+(?=1|$))/).filter(c=>!!c)
[ '00', '11', '000', '111', '0', '1' ]
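
For context on why the .filter step is there: split() splices the contents of every capture group into its result, so each match contributes the group that matched, undefined for the other group, and the (empty) text between adjacent matches. A minimal sketch, with variable names chosen only for illustration:

```javascript
const input = '001100011101';
const re = /(1+(?=0|$))|(0+(?=1|$))/;

// split() adds each capture group to the output, so every match yields
// one matched group plus `undefined` for the group that did not take part,
// along with the empty string found between neighbouring matches.
const raw = input.split(re);
console.log(raw);

// Dropping the falsy entries ('' and undefined) leaves only the runs.
const groups = raw.filter(Boolean);
console.log(groups); // [ '00', '11', '000', '111', '0', '1' ]
```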

I am wondering if I could get the same result using something like /([01])+(?=[^$1]|$)/: match 0s or 1s in a capture group, ([01]), followed by anything that is not what the capture group matched, [^$1]. This syntax clearly doesn't work, and I can't find out whether something like this is possible.

statox

3 Answers

The backreference needs to be outside of the character set (character sets don't accept backreferences like that).

In the pattern, you also need \1, not $1, and I don't think there's any need for the lookahead:

console.log(
  '001100011101'.match(/([01])(?:\1)*/g)
);
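
As a rough sketch of both points, the pattern can be wrapped in a small helper (splitRuns is a made-up name, not part of any library), and the u flag can be used to confirm that a backreference is not accepted inside brackets. Without the u flag, legacy syntax reads \1 inside brackets as an octal escape rather than a backreference, so it fails silently instead of throwing.

```javascript
// Hypothetical helper: returns the runs of identical digits in a 0/1 string.
function splitRuns(bits) {
  // ([01]) captures one digit; \1* then repeats that same digit.
  // match() returns null when nothing matches, hence the fallback to [].
  return bits.match(/([01])\1*/g) ?? [];
}

console.log(splitRuns('001100011101'));
// [ '00', '11', '000', '111', '0', '1' ]

// In unicode mode, \1 inside a character class is a syntax error.
let classBackrefAccepted = true;
try {
  new RegExp('([01])[^\\1]', 'u');
} catch (e) {
  classBackrefAccepted = false;
}
console.log(classBackrefAccepted); // false
```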
CertainPerformance

If your string consists only of 0 and 1, and you want to match sequences of 0 and 1, you don't need any groups.

You could match either one or more zeroes or one or more ones, using an alternation.

0+|1+

console.log('001100011101'.match(/0+|1+/g));
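
Note that this relies on the input containing nothing but 0 and 1: any other character is silently skipped by match(). If that matters, one possible guard looks like the sketch below (groupBits is a hypothetical name, and throwing a TypeError is just one way to handle bad input):

```javascript
// Hypothetical helper: groups a 0/1 string into runs, rejecting other input.
function groupBits(bits) {
  if (!/^[01]*$/.test(bits)) {
    throw new TypeError('expected a string of only 0 and 1 characters');
  }
  // match() returns null for an empty string, hence the fallback to [].
  return bits.match(/0+|1+/g) ?? [];
}

console.log(groupBits('001100011101'));
// [ '00', '11', '000', '111', '0', '1' ]

// Without the guard, the stray '2' is dropped and the 1s around it
// come back as two separate groups.
console.log('0012100'.match(/0+|1+/g));
// [ '00', '1', '1', '00' ]
```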
The fourth bird
  • Thank you for your answer. It doesn't answer my exact question, but your logic is right and that's what I will end up using. – statox Jan 19 '21 at 14:52
There are three things to consider then:

You can use

console.log('001100011101'.match(/([01])+?(?=(?!\1).|$)/g))
// Further simplified to
console.log('001100011101'.match(/([01])\1*/g))

Details:

  • ([01])+? - one or more 0 or 1 chars, as few as possible, while capturing the digit into Group 1
  • (?=(?!\1).|$) - a positive lookahead that matches a location immediately followed by either any char other than the one captured in Group 1, or the end of the string
  • ([01])\1* - a 0 or 1 captured into Group 1 and then 0 or more repetitions of the captured digit.
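
To check the details above, here is a small sketch that lists each whole match next to its Group 1 value for both patterns (the inspect helper and the variable names are only illustrative). In the lazy version Group 1 holds the last digit consumed by the repeated group, which is what the lookahead compares against the next char.

```javascript
const input = '001100011101';
const lazy = /([01])+?(?=(?!\1).|$)/g;
const simple = /([01])\1*/g;

// Collect [whole match, Group 1] pairs for every match of the pattern.
const inspect = (re) => [...input.matchAll(re)].map((m) => [m[0], m[1]]);

// Both patterns should report the same runs and the same captured digit.
console.log(inspect(lazy));
console.log(inspect(simple));
```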
Wiktor Stribiżew