I want to split the V2 column into two columns at the literal delimiter ')*('.

V1                                  V2
r1      (Direct)*(Mary*(Sewnf 45*S-a))
r2 (Ax 70a12*Qunion)*(Kin - 32431*Tip)
r3           (PAN*Q-23)*(BE 05/514/10)

The result should look like this:

V1                V2                          V3
r1           (Direct        Mary*(Sewnf 45*S-a))
r2  (Ax 70a12*Qunion            Kin - 32431*Tip)
r3         (PAN*Q-23               BE 05/514/10)

Here is what I have tried, but it does not give the result I want.

library(stringr)
str_split_fixed(as.character(data$V2), '\\)*(', 2)
str_split_fixed(as.character(data$V2), '\\)*\\(', 2)

I also tried:

strsplit(as.character(data$V2), '\\)*(')

How can I revise my script?

2 Answers

We can do this with separate from tidyr by specifying sep to match a ) followed by a * and a (. These are regex metacharacters: ( and ) delimit a capture group, while * means zero or more repetitions of the preceding element, so each one needs to be escaped (\\) to match the literal character. With extra = "merge", the split happens only at the first match and everything after it is kept together in the second column, i.e. 'V3'.

library(tidyr)
separate(df1, V2, into = c("V2", "V3"), "\\)\\*\\(", extra = "merge")
#  V1               V2                   V3
#1 r1          (Direct Mary*(Sewnf 45*S-a))
#2 r2 (Ax 70a12*Qunion     Kin - 32431*Tip)
#3 r3        (PAN*Q-23        BE 05/514/10)

In the OP's code, not all of the metacharacters were escaped: the first attempt leaves both * and ( unescaped, and the second leaves * unescaped.
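To see what each pattern actually does, here is a small base R check on the first value from the question. It is only an illustrative sketch: the variable names (x, attempt1, m2, m3) are made up for this example, and it uses base regexpr rather than stringr, though the escaping rules are the same.

```r
x <- "(Direct)*(Mary*(Sewnf 45*S-a))"

# Attempt 1: the final "(" is unescaped, so it opens a group that is never
# closed and the pattern is rejected as an invalid regular expression.
attempt1 <- suppressWarnings(try(regexpr("\\)*(", x), silent = TRUE))
attempt1_failed <- inherits(attempt1, "try-error")   # TRUE

# Attempt 2: "*" is unescaped, so the pattern reads "zero or more ')' then '('".
# That already matches the very first "(" of the string.
m2 <- regexpr("\\)*\\(", x)
as.integer(m2)               # 1
attr(m2, "match.length")     # 1

# Fully escaped: matches the literal three-character delimiter ")*(".
m3 <- regexpr("\\)\\*\\(", x)
as.integer(m3)               # 8
attr(m3, "match.length")     # 3
```

So the second attempt does run, but it splits at the opening bracket and leaves an empty first piece, which is why only the fully escaped pattern gives the expected columns.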

akrun

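library(stringr)
# fixed() makes stringr treat ")*(" as a literal string, not a regular expression
data[,c("V2","V3")] <- str_split_fixed(as.character(data$V2), fixed(")*("), 2)

Without fixed(), the pattern ")*(" is parsed as a regular expression and is invalid because of the unmatched parentheses. With it, no escaping is needed and this should work!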
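If you would rather avoid extra packages, the same split at the first literal ')*(' can be sketched in base R. This is a minimal example, not a drop-in solution: the data frame is rebuilt by hand from the question, the names pos, left, right and result are made up here, and it assumes every row contains the delimiter at least once.

```r
# Sample data rebuilt from the question
data <- data.frame(
  V1 = c("r1", "r2", "r3"),
  V2 = c("(Direct)*(Mary*(Sewnf 45*S-a))",
         "(Ax 70a12*Qunion)*(Kin - 32431*Tip)",
         "(PAN*Q-23)*(BE 05/514/10)"),
  stringsAsFactors = FALSE
)

# Position of the first literal ")*(" in each string.
# fixed = TRUE switches regex interpretation off, so nothing needs escaping.
# Assumption: the delimiter is present in every row (regexpr returns -1 otherwise).
pos <- regexpr(")*(", data$V2, fixed = TRUE)

# Everything before the delimiter, and everything after its 3 characters
left  <- substr(data$V2, 1, pos - 1)
right <- substr(data$V2, pos + 3, nchar(data$V2))

result <- data.frame(V1 = data$V1, V2 = left, V3 = right,
                     stringsAsFactors = FALSE)
print(result)
```

Because only the first occurrence is located, any later ')*(' stays inside V3, which mirrors splitting into at most two pieces.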

Kalees Waran