I have a question very similar to the one discussed here: Split data frame string column into multiple columns. However, given the following data frame:

before = data.frame(attr = c(1, 30, 4), Name = c('George', 'Mark', 'Susan'))
  attr          Name
1    1        George
2   30          Mark
3    4         Susan

I need to split the "Name" column into chunks of 2 characters, counted from the end of the string. The expected result is:

  attr Split1 Split2 Split3
1    1     Ge     or     ge
2   30            Ma     rk
3    4      S     us     an

I honestly have no idea how to use the tidyr separate function to achieve that. Thanks for the help.

Angelo

1 Answer

We can use extract with capture groups anchored at the end of the string ($): the 2nd and 3rd groups each match exactly two characters, while the 1st group lazily matches whatever is left at the start (possibly nothing).

library(stringr)
library(dplyr)
library(tidyr)
df1 %>% 
     extract(Name, into = str_c("Split", 1:3), "(.*?)(..)(..)$")
#   attr Split1 Split2 Split3
#1    1     Ge     or     ge
#2   30            Ma     rk
#3    4      S     us     an
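
To see what each capture group picks up, here is a minimal base R sketch that applies the same pattern outside of tidyr. The vector `x` and the matrix `parts` are illustrative names, not part of the answer above, and this is only meant to show how the groups line up, not to replace extract.

```r
# Same pattern as in extract(), applied with base R's regexec/regmatches.
x <- c("George", "Mark", "Susan")
m <- regmatches(x, regexec("(.*?)(..)(..)$", x, perl = TRUE))

# Each list element holds the full match followed by the three groups;
# drop the full match and stack the groups into a character matrix.
parts <- do.call(rbind, lapply(m, `[`, -1))
colnames(parts) <- paste0("Split", 1:3)
parts
```

Because the last two groups are pinned to the end of the string, the lazy first group only receives the characters that remain at the front, which is why "Mark" ends up with an empty Split1.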

Another option is to reverse the string, pass the split positions as sep in separate, and then reverse each piece back:

library(stringi)
df1 %>% 
  mutate(Name = stri_reverse(Name)) %>% 
  separate(Name, into = str_c("Split", 3:1), sep=c(2, 4)) %>%
  mutate(across(starts_with("Split"), stri_reverse)) %>%
  select(attr, Split1:Split3)

data

df1 <- structure(list(attr = c(1L, 30L, 4L), Name = c("George", "Mark", 
"Susan")), class = "data.frame", row.names = c("1", "2", "3"))
akrun
  • Thank you @akrun, I get an error saying "across" doesn't exist even after loading those libraries. Any suggestions? By the way, would it be easier if I only wanted to capture the two rightmost letters of each string? – Angelo Jun 25 '20 at 12:20
  • @Angelo I used `dplyr` 1.0.0. If your version is older than that, use `mutate_at(vars(starts_with('Split')), stri_reverse)` instead – akrun Jun 25 '20 at 18:19
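
On the follow-up about keeping only the two rightmost letters, a minimal base R sketch might look like the following. It assumes the same df1 as in the answer; the column name `Last2` is made up for illustration, and this is one possible approach rather than something proposed in the thread.

```r
# Rebuild the example data so the snippet runs on its own.
df1 <- data.frame(attr = c(1L, 30L, 4L),
                  Name = c("George", "Mark", "Susan"),
                  stringsAsFactors = FALSE)

# Start one character before the end of each string and keep the rest,
# i.e. the last two characters (or the whole string if it is shorter).
df1$Last2 <- substring(df1$Name, nchar(df1$Name) - 1)
df1
```

No regular expression or string reversal is needed in this case, since only a fixed-width suffix is taken.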