
I have a data frame in R with several columns. I want to use strsplit on one column to extract part of each string and put it into another column. The column is called IDName, and its strings have this format:

1-John
2-Tom

I want to split each string and put the ID in its own column and the name in its own column:

InputDF%>% mutate(ID=strsplit(IDName, "-"))-> OutputDF

This doesn't put the ID in the ID column; the column ends up as a list instead. How can I extract the ID and the name using the code above?

I tried this:

  InputDF%>%  mutate(ID=strsplit(IDName, "-")[[0]])-> OutputDF

But I am getting errors.

What is the best way to do this?

mans

1 Answer


We could use the separate function from the tidyr package:

library(dplyr)
library(tidyr)

df %>% 
  separate(x, c("ID", "name"), sep = '-')

output:

  ID    name 
  <chr> <chr>
1 1     John 
2 2     Tom  

data:

df <- tribble(
  ~x,
"1-John",
"2-Tom")
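
If you would rather stay with strsplit, here is a minimal sketch (not part of the original answer). strsplit returns a list with one character vector per row, so the piece you want has to be picked out of every element; R indexes from 1, so `[[0]]` does not work, and `[[1]]` would only give the pieces of the first row. The names `df`, `x`, and `out` follow the example data above and are otherwise just illustrative.

```r
library(dplyr)

# Same example data as above, built with base R
df <- data.frame(x = c("1-John", "2-Tom"), stringsAsFactors = FALSE)

out <- df %>%
  mutate(
    # strsplit() gives a list: one character vector per row
    parts = strsplit(x, "-", fixed = TRUE),
    # take element 1 (the ID) and element 2 (the name) from each vector
    ID    = vapply(parts, `[`, FUN.VALUE = character(1), 1),
    name  = vapply(parts, `[`, FUN.VALUE = character(1), 2)
  ) %>%
  select(-parts)  # drop the intermediate list column
```

If only the ID is needed, leave out the `name` line. Note that both new columns are character; wrap the ID in `as.integer()` if you need a number.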
TarJae
  • Thanks, that answers my question, but what if I want to do it using strsplit? Is there any way to do it with the strsplit function? – mans Mar 08 '22 at 17:21
  • @akrun Any way to do this with tidyverse? Also, what if I only need one part of the split, for example only the ID and not the name? – mans Mar 08 '22 at 17:28
  • @mans you may use `df %>% mutate(out = strsplit(x, "-")) %>% unnest_wider(out)`. But, in my opinion, `separate` is better because you can also use `convert = TRUE` to change the column types automatically, along with passing the column names of the split columns. Efficiency-wise, `strsplit` may be faster (not checked) – akrun Mar 08 '22 at 17:41
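
For the case raised in the comments where only the ID is wanted, a minimal sketch with separate (an illustration, not code posted in the thread): an `NA` entry in `into` tells separate to drop that piece, and `convert = TRUE` turns the ID into an integer. The names `df`, `x`, and `out` again follow the example data and are otherwise arbitrary.

```r
library(tidyr)

# Same example data as in the answer, built with base R
df <- data.frame(x = c("1-John", "2-Tom"), stringsAsFactors = FALSE)

# Keep only the first piece as ID; the NA in `into` omits the name piece.
# convert = TRUE converts the ID column from character to integer.
out <- separate(df, x, into = c("ID", NA), sep = "-", convert = TRUE)
```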