I want to split strings by id. I tried:

library(stringr)
library(data.table)
dt <- data.table(text = c("ABC", "DEF GHI JKL", "MON PQ"))
dt[, id := .I]
dt[, splitted := str_split(text, " ") |> unlist(), by = .(id)]

I would expect to get:

id  splitted
1   ABC
2   DEF
2   GHI
2   JKL
3   MON
3   PQ

However, I get the error:

Error in `[.data.table`(dt, , `:=`(splitted, unlist(str_split(text, " "))),  : 
  Supplied 3 items to be assigned to group 2 of size 1 in column 'splitted'. The RHS length must either be 1 (single values are ok) or match the LHS length exactly. If you wish to 'recycle' the RHS please use rep() explicitly to make this intent clear to readers of your code.

I expected id to be recycled automatically so that each piece gets its own row. How can I fix this? Thanks.

Fabio Correa

1 Answer

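# Split each string into a list column (one character vector per row)
dt[, text2 := strsplit(text, " ")]
# Repeat each row index once per piece, then flatten the pieces
dt2 <- dt[, .(id = rep(seq_along(text2), times = lengths(text2)),
              text = unlist(text2))]
dt2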
#       id   text
#    <int> <char>
# 1:     1    ABC
# 2:     2    DEF
# 3:     2    GHI
# 4:     2    JKL
# 5:     3    MON
# 6:     3     PQ
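
For what it's worth, the original attempt fails because `:=` adds a column by reference and cannot change the number of rows, so a group of size 1 cannot receive 3 values. A minimal sketch of an alternative, assuming the `id` column from the question: return a new table from `j` with `.()` and group by `id`, so data.table repeats `id` for each piece. I use base `strsplit` here rather than `stringr::str_split`, and the name `res` is only for illustration.

```r
library(data.table)

dt <- data.table(text = c("ABC", "DEF GHI JKL", "MON PQ"))
dt[, id := .I]

# Aggregate instead of assigning: each group may return any number of rows,
# and the grouping column id is repeated to match.
res <- dt[, .(splitted = unlist(strsplit(text, " "))), by = id]
res
```

This leaves `dt` itself unchanged and puts the long format in a new table, which seems to be the shape the question is after.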
r2evans