Split by separator into many columns

Question

How is it possible from a dataframe which contains one column which contains values separated by ";"

df  <-data.frame(id = c(1,2,3), stock = c("Google;Yahoo","Microsoft;Google","Yahoo"))

produce a dataframe like this:

df  <-data.frame(id = c(1,2,3), stock_1 = c("Google","Microsoft","Yahoo"), stock_2 = c("Yahoo","Google","NA"))

`library(splitstackshape); cSplit(df, "stock", ";")`. – A5C1D2H2I1M1N2O1R2T1 Dec 26 '17 at 13:42 — A5C1D2H2I1M1N2O1R2T1, Dec 26 '17 at 13:42

score 4 · Accepted Answer · edited Dec 26 '17 at 17:04

1) separate_rows Use separate_rows to convert it to long form, add a name column containing the eventual column names and use spread to convert it back to wide form.

library(dplyr)
library(tidyr)

df %>% 
   separate_rows(stock) %>% 
   group_by(id) %>% 
   mutate(name = paste("stock", seq_along(stock), sep = "_")) %>% 
   ungroup %>% 
   spread(name, stock)

giving:

# A tibble: 3 x 3
     id   stock_1 stock_2
* <dbl>     <chr>   <chr>
1     1    Google   Yahoo
2     2 Microsoft  Google
3     3     Yahoo    <NA>

2) separate If we knew that there were no more than 2 sub-fields then we could use separate giving the same.

library(dplyr)
library(tidyr)

df %>%
   separate(stock, c("stock_1", "stock_2"), fill = "right")

3) read.table This approach uses no packages.

stocks <- read.table(text = as.character(df$stock), sep = ";", as.is = TRUE, fill = TRUE)
names(stocks) <- paste("stock", seq_along(stocks), sep = "_")
cbind(df[1], stocks)

giving:

  id   stock_1 stock_2
1  1    Google   Yahoo
2  2 Microsoft  Google
3  3     Yahoo

score 2 · Answer 2 · answered Dec 26 '17 at 14:35

Comment to answer, plus another option for completeness:

splitstackshape

library(splitstackshape)
cSplit(df, "stock", ";")
#    id   stock_1 stock_2
# 1:  1    Google   Yahoo
# 2:  2 Microsoft  Google
# 3:  3     Yahoo      NA

data.table

library(data.table)
setDT(df)[, c("stock_1", "stock_2") := tstrsplit(stock, ";")][, stock := NULL][]
#    id   stock_1 stock_2
# 1:  1    Google   Yahoo
# 2:  2 Microsoft  Google
# 3:  3     Yahoo      NA

Split by separator into many columns

2 Answers2

splitstackshape

data.table