When converting text to columns with data.table::tstrsplit, I'd like to split on every period (.) that is not enclosed in double quotes ("). The task boils down to this:

test_string <- c('foo.bar.baz', 'fizz.buzz."ba.zz"')
strsplit(test_string, "...", perl = TRUE)

To result in:

[[1]]
[1] "foo" "bar" "baz"

[[2]]
[1] "fizz" "buzz" "ba.zz"

EDIT: based on the linked duplicate I was able to get this, although the quoted field still keeps its double quotes:

R> strsplit(test_string, '\\.(?=(?:[^\\"]*\\"[^\\"]*\\")*[^\\"]*$)', perl = TRUE)
[[1]]
[1] "foo" "bar" "baz"

[[2]]
[1] "fizz" "buzz" "\"ba.zz\""
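
To get from the output above to the desired result, the wrapping quotes can be stripped after the split. This is only a minimal sketch: the names pattern and parts are made up for illustration, and it assumes every quoted field is wrapped in exactly one pair of double quotes with no escaped quotes inside.

```r
test_string <- c('foo.bar.baz', 'fizz.buzz."ba.zz"')

# Same pattern as in the EDIT: a period followed by an even number of
# double quotes up to the end of the string, i.e. a period outside quotes.
pattern <- '\\.(?=(?:[^\\"]*\\"[^\\"]*\\")*[^\\"]*$)'
parts <- strsplit(test_string, pattern, perl = TRUE)

# Assumption: a quoted field starts and ends with a double quote,
# so removing one leading and one trailing quote is enough.
parts <- lapply(parts, function(x) gsub('^"|"$', "", x))
parts
```

Since tstrsplit forwards its extra arguments to strsplit, the same pattern should work as data.table::tstrsplit(test_string, pattern, perl = TRUE), with the quote stripping applied to each resulting column.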
mlegge