R - Split one column into two when the divider is a dot

Question

I am trying to split a column into two separate ones when the divider is a dot.

Following this example, I first tried:

library(tidyverse)

df1 <- data.frame(Sequence.Name= c(2.1, 2.1, 2.1),
                  var1 = c(1, 1, 0))
df1$Sequence.Name %>% 
  as.character %>%
  str_split_fixed(".",2) %>%
  head
#>      [,1] [,2]
#> [1,] ""   ".1"
#> [2,] ""   ".1"
#> [3,] ""   ".1"

Created on 2021-04-05 by the reprex package (v0.3.0)

But this is not what I want: the first column is empty, and the second one still has the dot.

Following the comments in the post I linked above, I tried to add fixed="." or fixed=TRUE, but it does not seem to work:

library(tidyverse)

df1 <- data.frame(Sequence.Name= c(2.1, 2.1, 2.1),
                  var1 = c(1, 1, 0))
df1$Sequence.Name %>% 
  as.character %>%
  str_split_fixed(".",fixed=".",2) %>%
  head
#> Error in str_split_fixed(., ".", fixed = ".", 2): unused argument (fixed = ".")

Created on 2021-04-05 by the reprex package (v0.3.0)

score 3 · Accepted Answer · answered Apr 05 '21 at 13:18

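Something like this? By default, separate() from tidyr splits on any run of non-alphanumeric characters, so the dot needs no escaping: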

df1 %>% separate(Sequence.Name, into = c("Col1", "Col2"))

  Col1 Col2 var1
1    2    1    1
2    2    1    1
3    2    1    0
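
As a side note on why the original stringr attempt fails: str_split_fixed() treats its pattern as a regular expression, in which "." matches any character, so the split happens at the very first character and leaves an empty first piece. The fixed = argument belongs to base strsplit(), not to stringr, which would explain the "unused argument" error. A minimal sketch of one possible fix, assuming stringr is installed (Col1 and Col2 are illustrative names, not from the question):

```r
library(stringr)

df1 <- data.frame(Sequence.Name = c(2.1, 2.1, 2.1),
                  var1 = c(1, 1, 0))

# fixed(".") matches a literal dot; the escaped regex "\\." would do the same here
parts <- str_split_fixed(as.character(df1$Sequence.Name), fixed("."), 2)

# Attach the two pieces as numeric columns (Col1/Col2 are illustrative names)
res <- data.frame(Col1 = as.numeric(parts[, 1]),
                  Col2 = as.numeric(parts[, 2]),
                  var1 = df1$var1)
res
```

separate() is likely still the more convenient choice when the data already lives in a data frame; this sketch only shows how the stringr call itself can be repaired.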

AnilGoyal

score 3 · Answer 2 · answered Apr 05 '21 at 16:06

It is also possible to do this in base R with read.table, using the dot as the field separator:

cbind(read.table(text = as.character(df1$Sequence.Name), sep=".", 
         header = FALSE, col.names = c("Col1", "Col2")), df1['var1'])
#  Col1 Col2 var1
#1    2    1    1
#2    2    1    1
#3    2    1    0
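
A related base R sketch uses strsplit() with fixed = TRUE, which is probably the argument the question was trying to pass to stringr. It assumes every value contains exactly one dot, so that all split results have length two; Col1 and Col2 are again illustrative names.

```r
df1 <- data.frame(Sequence.Name = c(2.1, 2.1, 2.1),
                  var1 = c(1, 1, 0))

# fixed = TRUE makes strsplit() treat "." as a literal dot, not a regex wildcard
pieces <- strsplit(as.character(df1$Sequence.Name), ".", fixed = TRUE)

# Stack the length-2 character vectors into a two-column matrix
parts <- do.call(rbind, pieces)

res <- data.frame(Col1 = as.numeric(parts[, 1]),
                  Col2 = as.numeric(parts[, 2]),
                  var1 = df1$var1)
res
```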

akrun

What a smart approach using `read.table`! – ThomasIsCoding Apr 05 '21 at 22:12

score 2 · Answer 3 · answered Apr 05 '21 at 22:11

Here is a data.table option using tstrsplit, with the dot escaped so that it is matched literally:

library(data.table)

setDT(df1)[, c(lapply(tstrsplit(Sequence.Name, "\\."), as.numeric), .(var1))]
   V1 V2 V3
1:  2  1  1
2:  2  1  1
3:  2  1  0
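
If named columns are preferred over V1, V2, V3, a variant along these lines may help. It is a sketch that assumes a reasonably recent data.table, where tstrsplit() forwards fixed = TRUE to strsplit() and accepts type.convert = TRUE; the names Col1 and Col2 are illustrative.

```r
library(data.table)

df1 <- data.frame(Sequence.Name = c(2.1, 2.1, 2.1),
                  var1 = c(1, 1, 0))

# Work on a copy so the original data frame is left untouched
dt <- as.data.table(df1)

# fixed = TRUE is passed through to strsplit(), so "." is a literal dot;
# type.convert = TRUE turns the character pieces back into numbers
dt[, c("Col1", "Col2") := tstrsplit(Sequence.Name, ".", fixed = TRUE,
                                    type.convert = TRUE)]

# Drop the original combined column
dt[, Sequence.Name := NULL]
dt
```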

ThomasIsCoding

score 2 · Answer 4 · answered Apr 05 '21 at 22:20

tidyr's extract() also works, using a regular expression with one capture group per new column:

library(tidyr)

df1 %>%
  # Escape the dot so it matches literally; an unescaped "." matches any character
  extract(col = Sequence.Name, into = c("Sequence", "Name"), regex = "(\\d+)\\.(\\d+)")

  Sequence Name var1
1        2    1    1
2        2    1    1
3        2    1    0

Anoushiravan R
