This seems like a basic question, so please feel free to point me to another answer, but I couldn't find one through searching.

I have a set of 180,000 rows that looks like this:

df <- c("12hfgog|hcsg9ws|xaw_07cas", "fhjf79", "8xxghk")

I want to split the string at the delimiter "|" and create a new dataframe with the results that looks like this:

df2 <- c("12hfgog","hcsg9ws", "xaw_07cas", "fhjf79", "8xxghk")

I know it involves some combination of strsplit, unlist and unnest but I can't quite get it right. Any help appreciated!

rawr
Oliver

2 Answers

I suggest using sapply with strsplit, then converting the resulting list to a vector with unlist. The "|" is escaped because split is interpreted as a regular expression by default.

df1 <- c("12hfgog|hcsg9ws|xaw_07cas", "fhjf79", "8xxghk")
df2 <- unlist(sapply(df1, strsplit, split = "\\|", USE.NAMES = FALSE))
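
If the delimiter only needs to match literally, a minimal alternative sketch is to pass `fixed = TRUE` to `strsplit`, which avoids the escaping altogether. The names `x` and `parts` are chosen here for illustration and are not from the question:

```r
# Same sample data as in the question
x <- c("12hfgog|hcsg9ws|xaw_07cas", "fhjf79", "8xxghk")

# fixed = TRUE matches "|" literally instead of as a regex alternation
parts <- unlist(strsplit(x, "|", fixed = TRUE))
parts
```

Since `strsplit` already works on the whole vector, no `sapply` call is needed in this variant.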

Regards.

Eder
    `strsplit` is vectorized, so you can skip `sapply`, as per @GordonShumway's comment (i.e. `unlist(strsplit(df, "\\|"))`). – ngwalton Dec 24 '19 at 20:53

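We can use separate_rows from tidyr, assuming the strings are in a column of a data frame (colname is a placeholder for that column). Pass sep explicitly, because the default separator splits on any non-alphanumeric character, including "_":

library(tidyr)
# df must be a data frame here; colname is the column holding the strings
df %>% 
   separate_rows(colname, sep = "\\|")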
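
As a self-contained sketch of this approach, the question's vector can first be wrapped in a one-column data frame. The names `dat`, `id` and `out` are hypothetical and used only for illustration:

```r
library(tidyr)

# Hypothetical wrapper: the question's vector placed in a one-column data frame
dat <- data.frame(
  id = c("12hfgog|hcsg9ws|xaw_07cas", "fhjf79", "8xxghk"),
  stringsAsFactors = FALSE
)

# Split on a literal pipe only, so the "_" in "xaw_07cas" is left alone
out <- separate_rows(dat, id, sep = "\\|")
out$id
```

Each delimited value ends up in its own row, so `out$id` can be used wherever a plain character vector is needed.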
akrun