The output of

strsplit('abc dcf', split = '(?=c)', perl = T)

is as expected.

However, the output of

strsplit('abc dcf', split = '(?!c)', perl = T)

is

[[1]]
[1] "a" "b" "c" " " "d" "c" "f"

while my expectation is

[[1]]
[1] "a"  "b"  "c " "d"  "cf"

because I thought the string would not be split at a position where the last character of the previous chunk is c. Is my understanding of negative lookahead wrong?

mt1022
  • `(?<!c)` is a negative lookbehind, not negative lookahead that is `(?!c)` – Toto Feb 15 '17 at 11:22
  • @Toto. That does not seem to be the issue. `strsplit('abc dcf', split = '(?!c)', perl = T)` gives the same output as `strsplit('abc dcf', split = '(?<!c)', perl = T)`, and I edited the question a little based on your comment. – mt1022 Feb 15 '17 at 11:25
  • It does not really matter if it is a positive or negative lookahead. What matters is how `strsplit` deals with zero-length matches. – Wiktor Stribiżew Feb 15 '17 at 12:13
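
To illustrate the point about zero-length matches, here is a rough sketch of how `strsplit` appears to consume its input. It is based on the algorithm described in `?strsplit`, plus the assumption that an empty match at the very start of the remaining string yields a one-character token. `emulate_strsplit` is a hypothetical helper written for this illustration, not part of R.

```r
# Hypothetical helper: a simplified emulation of strsplit(x, split, perl = TRUE)
# for a single string. Assumption: an empty match at the very start of the
# remaining text makes strsplit emit exactly one character as a token.
emulate_strsplit <- function(x, split) {
  tokens <- character(0)
  while (nchar(x) > 0) {
    m <- regexpr(split, x, perl = TRUE)
    start <- as.integer(m)              # 1-based match start, -1 if no match
    len <- attr(m, "match.length")      # 0 for a zero-length match
    if (start == -1L) {
      # No match left: the rest of the string is the final token.
      tokens <- c(tokens, x)
      break
    }
    match_end <- start - 1L + len       # number of characters up to the match end
    if (match_end > 0L) {
      # Token is the text before the match; drop the token and the match.
      tokens <- c(tokens, substr(x, 1L, start - 1L))
      x <- substr(x, match_end + 1L, nchar(x))
    } else {
      # Empty match at position 1: emit one character and move on.
      tokens <- c(tokens, substr(x, 1L, 1L))
      x <- substr(x, 2L, nchar(x))
    }
  }
  tokens
}

emulate_strsplit("abc dcf", "(?!c)")
# [1] "a" "b" "c" " " "d" "c" "f"
```

Under this reading, the search restarts on the remaining text after every token. With `(?!c)`, almost every restart finds an empty match at the first position, so the string is consumed one character at a time. When the remaining text starts with `c`, the first match is one position later, so `c` is emitted on its own and never joined to what follows.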

1 Answer

We can try

strsplit('abc dcf', "(?![c ])\\s*\\b", perl=TRUE)
#[[1]]
#[1] "a"  "b"  "c " "d"  "cf"
akrun