applying str_split_fixed function simultaneously across multiple columns in r

Question

I have a data frame which is having multiple columns each column is having string values. I want to split column values by coma separator in the output data frame. Input and required output are as below

Col1=c("a,b,c","9,a,5")
Col2=c("c,b,e","4,r,t")
Col3=c("e,f,g","y,z,d")
Input=data.frame(Col1,Col2,Col3)

Column1=c("a","9")
Column2=c("b","a")
Column3=c("c","5")
Column4=c("c","4")
Column5=c("b","r")
Column6=c("e","t")
Column7=c("e","y")
Column8=c("f","z")
Column9=c("g","d")
Output=data.frame(Column1,Column2,Column3,Column4,Column5,Column6,Column7,Column8,Column9)

If anyone know the solution please help.

Sotos · Accepted Answer · 2022-05-20T12:21:15.350

1

The package splitstackshape is made for such operations,

library(splitstackshape)

cSplit(Input, names(Input), type.convert = 'as.is')

#   Col1_1 Col1_2 Col1_3 Col2_1 Col2_2 Col2_3 Col3_1 Col3_2 Col3_3
#1:      a      b      c      c      b      e      e      f      g
#2:      9      a      5      4      r      t      y      z      d

edited May 20 '22 at 12:21

answered May 20 '22 at 12:12

Sotos

51,121
6
32
66

1

Ya this works. Answer is useful. Thank you – ParthaSarathi May 20 '22 at 12:23
How do I do it If I have to split it before some character. – ParthaSarathi May 23 '22 at 06:57
There is an argument called `sep` (default is a comma) where you can define your delimiter , i.e. `cSplit(Input, names(Input), sep = 'whatever' ,type.convert = 'as.is')` – Sotos May 23 '22 at 07:03
It is like I have few values as comma and space in them I should avoid those and use only values without space to split into columns. ex: "a,b, c,d", the result would split into Column1=c("a") Column2=c("b, c") and Column3=c("d"). Or I can give separate into columns before some character like "P/N" – ParthaSarathi May 23 '22 at 07:08
You can replace all `, ` with `,` and split on that. Check my answer [here](https://stackoverflow.com/a/37305413/5635580) – Sotos May 23 '22 at 07:11
What if I have NA in any of the column value? – ParthaSarathi May 30 '22 at 06:36
You need to specify everything in your question and not ask additional stuff that change the whole approach in comments. – Sotos May 30 '22 at 06:51

score 0 · Answer 2 · answered May 20 '22 at 12:24

Sotos already provided a better suggestion, but here's an alternative dplyr/tidyr solution:

library(dplyr)
library(tidyr)
  
Input %>%
  mutate(id = row_number()) %>%
  pivot_longer(-id) %>%
  separate(value, c('1', '2', '3')) %>%
  pivot_wider(names_from=name, names_glue = "{name}_{.value}", values_from = '1':'3') %>%
  select(-id)

#> # A tibble: 2 × 9
#>   Col1_1 Col2_1 Col3_1 Col1_2 Col2_2 Col3_2 Col1_3 Col2_3 Col3_3
#>   <chr>  <chr>  <chr>  <chr>  <chr>  <chr>  <chr>  <chr>  <chr> 
#> 1 a      c      e      b      b      f      c      e      g     
#> 2 9      4      y      a      r      z      5      t      d

score 0 · Answer 3 · answered May 20 '22 at 15:29

0

With base R using read.csv

read.csv(text = do.call(paste, c(Input, sep = ",")), header = FALSE)
  V1 V2 V3 V4 V5 V6 V7 V8 V9
1  a  b  c  c  b  e  e  f  g
2  9  a  5  4  r  t  y  z  d

answered May 20 '22 at 15:29

akrun

874,273
37
540
662

applying str_split_fixed function simultaneously across multiple columns in r

3 Answers3