I have a dataframe that looks like this:

df <- data.frame(text=c('my_text', 'looks_like_this', 'I_want_to_split_it'))

I want to use some kind of dplyr family function to make a dataframe that looks like this:

newdf <- data.frame(text=c('my_text', 'looks_like_this', 'I_want_to_split_it'),
                    W1=c('my', 'looks', 'I'),
                    W2=c('text', 'like', 'want'),
                    W3=c(NA, 'this', 'to'),
                    W4=c(NA, NA, 'split'),
                    W5=c(NA, NA, 'it'))

I'm thinking something like this:

newdf <- df %>%
  mutate(WX=strsplit(text, '_'))

But I can't quite figure it out.

Amadou Kone

1 Answer

We can use strsplit and then pad each element with NA up to the length of the longest one

lst1 <- strsplit(as.character(df$text), "_")
out <- do.call(rbind.data.frame, lapply(lst1, `length<-`, max(lengths(lst1))))
names(out) <- paste0("W", seq_along(out))
cbind(df, out)
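
The padding step may be easier to see on a toy vector. This is only a sketch (x and padded are illustrative names, not part of the data above): calling `length<-` with a larger length returns the vector extended with NA.

```r
# Toy vector, for illustration only
x <- c("my", "text")

# `length<-`(x, 5) is the functional form of `length(x) <- 5`;
# it returns a copy of x extended to length 5, with NA in the new positions
padded <- `length<-`(x, 5)
print(padded)
```

In the answer's code, lapply applies this to every element of lst1, with max(lengths(lst1)) as the target length, so all rows end up equally long before rbind.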

Or another option is read.table

cbind(df, read.table(text = as.character(df$text), sep="_", header = FALSE,
      fill = TRUE, col.names = paste0("W", 1:5)))
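
The 1:5 above is hard-coded for this example. A minimal sketch, assuming the same df and "_" separator, of deriving the number of columns from the data instead (n_cols is just an illustrative name):

```r
df <- data.frame(text = c('my_text', 'looks_like_this', 'I_want_to_split_it'))

# Largest number of "_"-separated pieces found in any row
n_cols <- max(lengths(strsplit(as.character(df$text), "_", fixed = TRUE)))

# Same read.table call as above, with the column count computed from the data
out <- cbind(df, read.table(text = as.character(df$text), sep = "_",
                            header = FALSE, fill = TRUE,
                            col.names = paste0("W", seq_len(n_cols))))
print(out)
```

The same n_cols could be used in place of 1:5 wherever the column names are generated.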

With tidyverse, we can use separate

library(dplyr)
library(tidyr)
library(stringr)
df %>%
      separate(text, into = str_c("W", 1:5), fill = 'right', remove = FALSE)
#               text    W1   W2   W3    W4   W5
#1            my_text    my text <NA>  <NA> <NA>
#2    looks_like_this looks like this  <NA> <NA>
#3 I_want_to_split_it     I want   to split   it

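Or after doing the strsplit based on the OP's code, use unnest_wider. Newer tidyr versions need names_sep when the list elements are unnamed, so the list column is called W here to get W1, W2, ...

df %>%
   mutate(W = strsplit(as.character(text), "_")) %>% 
   # names_sep = "" pastes the column name and the element index: W1, W2, ...
   unnest_wider(W, names_sep = "")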

Or using cSplit, which by default drops the original column and names the new ones text_1, text_2, ...

library(splitstackshape)
cSplit(df, "text", "_")
akrun