I have a string in which periods and backslashes are escaped with a backslash. I would like to split the string on any unescaped periods, but I am struggling to come up with a pattern to do so.

const escaped = "two slashes: \\\\.one period: \..line and a dot: \\\.";

// ["two slashes: \\", "one period: .", "line and a dot: \."]
console.log(escaped.split(/* ? */))

This (?<!\\)(?:(\\\\)*)[*] is close, but split() includes capturing groups in the output array, which is not what I want. The solution should be match-only, like here:

(?<!\\)(?:\\\\)*\K\.
Wiktor Stribiżew
kalkronline
  • @anubhava [It pushes capturing groups to the output,](https://stackoverflow.com/questions/17516040/javascript-regex-split-produces-too-many-items) so the pattern can not use them. – kalkronline Mar 25 '22 at 20:21
  • @anubhava That doesn't solve the issue, it needs to match only the period, using capturing groups matches the capturing groups, whether they are non-captured or not. – kalkronline Mar 25 '22 at 20:28
  • try `/(?<!\\)\K\./` – medilies Mar 25 '22 at 20:33
  • @medilies [Javascript doesn't have \K](https://stackoverflow.com/questions/55874385/javascript-regex-kkeep-substitution) – kalkronline Mar 25 '22 at 20:40
  • I'm searching for an equivalent [1](https://www.phpliveregex.com/p/E8J) [2](https://regex101.com/r/W71bAB/1) – medilies Mar 25 '22 at 20:42
  • [This pattern](https://regex101.com/r/vxxNL6/1) works like I want it to; if there's an equivalent, then it would answer my question. – kalkronline Mar 25 '22 at 20:46

1 Answer

The positive lookbehind solution will work in any JavaScript environment that supports the ECMAScript 2018+ standard:

/(?<=(?<!\\)(?:\\\\)*)\./

See this regex demo.

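The regex matches any . char that is immediately preceded by an even number of backslashes, including zero (i.e. it also matches when there are no backslashes before the . at all). Since the pattern contains no capturing groups, split() returns only the pieces between the matched periods.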
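As a minimal sketch of how the lookbehind pattern can be used with split(): the sample text below is an assumption written with String.raw so that every backslash is literal, and the names escaped, unescapedDot and parts are only illustrative.

```javascript
// String.raw keeps every backslash literal, so the text is exactly what is typed here.
const escaped = String.raw`two slashes: \\.one period: \..line and a dot: \\\.`;

// A dot preceded by an even number (including zero) of backslashes.
// The pattern has no capturing groups, so split() returns only the pieces.
const unescapedDot = /(?<=(?<!\\)(?:\\\\)*)\./;

const parts = escaped.split(unescapedDot);

// Print each piece on its own line to avoid quoting confusion.
for (const part of parts) {
  console.log(part);
}
// two slashes: \\
// one period: \.
// line and a dot: \\\.
```

The pieces are not unescaped; if the unescaped text is needed, that would be a separate step after splitting.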

In older JavaScript environments, you will need a workaround such as text.match(/(?:\\[^]|[^\\.])+/g). See this regex demo. This matches one or more occurrences of either a \ followed by any single char, or any single char other than a backslash and a dot.
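A small sketch of the match-based workaround, assuming the same kind of sample text (again written with String.raw; the names text and segments are illustrative):

```javascript
// String.raw keeps every backslash literal, so the text is exactly what is typed here.
const text = String.raw`two slashes: \\.one period: \..line and a dot: \\\.`;

// Each match is a run of "backslash + any char" pairs or chars that are
// neither a backslash nor a dot, so a run stops at the first unescaped dot.
// match() returns null when nothing matches, hence the fallback to [].
const segments = text.match(/(?:\\[^]|[^\\.])+/g) || [];

for (const segment of segments) {
  console.log(segment);
}
// two slashes: \\
// one period: \.
// line and a dot: \\\.
```

Unlike split(), this approach does not produce empty strings for consecutive unescaped periods: "a..b" yields ["a", "b"] here, while splitting would give ["a", "", "b"].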

Wiktor Stribiżew