
I have the following data set, D7:

name sex_age eye_color height
1    J    M.34     Other     61
2    A    F.55      Blue     59
3    T    M.76     Brown     51
4    D    F.19     Other     57

I want to separate the column sex_age into sex column and age column, so I type

separate(D7,sex_age,c('sex','age'),sep='.')

But it generates

name sex age eye_color height
1    J             Other     61
2    A              Blue     59
3    T             Brown     51
4    D             Other     57
Warning message:
Too many values at 4 locations: 1, 2, 3, 4 

Also, when I modify my original data set D7 into D8

name sex_age eye_color height
1    J    M_34     Other     61
2    A    F_55      Blue     59
3    T    M_76     Brown     51
4    D    F_19     Other     57

and type D7 %>% separate(sex_age,c('sex','age'),sep="_"), it gives

name  sex  age eye_color height
1    J M.34 <NA>     Other     61
2    A F.55 <NA>      Blue     59
3    T M.76 <NA>     Brown     51
4    D F.19 <NA>     Other     57
Warning message:
Too few values at 4 locations: 1, 2, 3, 4 

Did I misuse the separate function? I am very puzzled. Thank you for any suggestions.

KevinKim

1 Answer


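The sep argument is interpreted as a regular expression, and . is a special character there: it matches any single character. With sep='.', every character of M.34 is treated as a separator, which is why the new columns come out empty and you get the "Too many values" warning. Escape the dot with \\ so that it is read as a literal character: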

separate(df, sex_age, into = c("sex", "age"), sep = "\\.")
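
For a reproducible check, here is a minimal sketch that rebuilds D7 from the values in the question and applies the escaped separator. The names res, res2 and res3 are only illustrative. The "[.]" form and convert = TRUE are optional variations, not part of the call above; it assumes tidyr is installed.

```r
library(tidyr)

# Rebuild the question's data set (values copied from the post)
D7 <- data.frame(
  name      = c("J", "A", "T", "D"),
  sex_age   = c("M.34", "F.55", "M.76", "F.19"),
  eye_color = c("Other", "Blue", "Brown", "Other"),
  height    = c(61, 59, 51, 57),
  stringsAsFactors = FALSE
)

# Escape the dot so the regex matches a literal "."
res <- separate(D7, sex_age, into = c("sex", "age"), sep = "\\.")

# A character class is another way to make the dot literal
res2 <- separate(D7, sex_age, into = c("sex", "age"), sep = "[.]")

# convert = TRUE type-converts the new columns, so age is no longer character
res3 <- separate(D7, sex_age, into = c("sex", "age"), sep = "\\.", convert = TRUE)

print(res)
```

Note that the second call in the question is applied to D7 rather than D8, so no "_" is found in the column; that would explain the "Too few values" warning there.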
joel.wilson