In data.table v1.9.6 you can split a variable into columns like so:

library(data.table)
DT = data.table(x=c("A/B", "A", "B"), y=1:3)
DT[, c("c1", "c2") := tstrsplit(x, "/", fixed=TRUE)][]

The number of required splits [above: 2] is not always known in advance. How can I generate the required variable names when the number of splits is known?

n = 2  # desired number of splits
# naive attempt to build required string
m = paste0("'", "myvar", 1:n, "'", collapse = ",")
m = paste0("c(", m, ")" )

# [1] "c('myvar1','myvar2')"


DT[, m := tstrsplit(x, "/", fixed=TRUE)][]  # doesn't work
Henk

2 Answers

Two methods. The first is strongly suggested:

#one
n=2
DT[, paste0("myvar", 1:n) := tstrsplit(x, "/", fixed=TRUE)][]
#     x y myvar1 myvar2
#1: A/B 1      A      B
#2:   A 2      A     NA
#3:   B 3      B     NA

#two
DT[, eval(parse(text=m)) := tstrsplit(x, "/", fixed=TRUE)][]
#     x y myvar1 myvar2
#1: A/B 1      A      B
#2:   A 2      A     NA
#3:   B 3      B     NA 
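
Method one can also be written with the names stored in a character vector first. A minimal sketch on the same example data, where the variable name `cols` is only illustrative: wrapping it in parentheses on the left of `:=` makes data.table treat it as a vector of column names rather than as a single column called "cols".

```r
library(data.table)

DT <- data.table(x = c("A/B", "A", "B"), y = 1:3)
n <- 2                                # number of splits, known in advance
cols <- paste0("myvar", seq_len(n))   # c("myvar1", "myvar2"); `cols` is an illustrative name

# Parentheses make `:=` evaluate `cols` instead of creating a column named "cols"
DT[, (cols) := tstrsplit(x, "/", fixed = TRUE)]
print(DT)
```

This is the same assignment as method one, so it avoids `eval(parse())` in the same way.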

extra

If you do not know the number of splits beforehand:

splits <- max(lengths(strsplit(DT$x, "/", fixed=TRUE)))
DT[, paste0("myvar", 1:splits) := tstrsplit(x, "/", fixed=TRUE)][]
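
A possible variation for the unknown case, sketched under the assumption that splitting once is preferable to splitting twice: keep the list that `tstrsplit()` returns and derive the names from its length. The name `parts` is illustrative and not part of the approach above.

```r
library(data.table)

DT <- data.table(x = c("A/B", "A", "B"), y = 1:3)

# tstrsplit() returns a list with one element per resulting column
parts <- tstrsplit(DT$x, "/", fixed = TRUE)   # `parts` is an illustrative name

# Build exactly as many names as there are pieces, then assign them all at once
DT[, paste0("myvar", seq_along(parts)) := parts]
print(DT)
```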
Pierre L
  • If you specify more splits than are possible it will recycle. They may have a typo in their question: `"How can I generate the required variable names when the number of splits is known?"`. Maybe they meant to write "unknown". – Pierre L Oct 18 '15 at 16:28
  • they meant known. have edited question to make it [hopefully] clearer by setting n = 2. – Henk Oct 18 '15 at 16:29
  • Then the above answer will do it. – Pierre L Oct 18 '15 at 16:32

Another simple way of doing this: instead of making extra columns, you can stack the split strings in a single column:

DT = data.table(x=c("A/B", "A", "B"), y=1:3)

# unlist() makes `new` a plain character column rather than a list column
DT1 <- DT[, .(new=unlist(tstrsplit(x, "/", fixed=TRUE))), by=y]
DT1

#    y new
# 1: 1   A
# 2: 1   B
# 3: 2   A
# 4: 3   B
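
If the position of each piece within its original string matters, the stacked form can carry an index as well. A rough sketch, assuming `y` uniquely identifies each row as in the example; the names `long` and `part` are illustrative, and `strsplit()` is used here in place of `tstrsplit()`.

```r
library(data.table)

DT <- data.table(x = c("A/B", "A", "B"), y = 1:3)

# One row per piece; unlist() flattens the strsplit() result for each group
long <- DT[, .(new = unlist(strsplit(x, "/", fixed = TRUE))), by = y]

# Number the pieces within each original row (1 = first piece, 2 = second, ...)
long[, part := seq_len(.N), by = y]
print(long)
```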
user3576212