2

I am trying to split a column into two separate ones when the divider is a dot.

Following this example, I first tried:

library(tidyverse)

df1 <- data.frame(Sequence.Name= c(2.1, 2.1, 2.1),
                  var1 = c(1, 1, 0))
df1$Sequence.Name %>% 
  as.character %>%
  str_split_fixed(".",2) %>%
  head
#>      [,1] [,2]
#> [1,] ""   ".1"
#> [2,] ""   ".1"
#> [3,] ""   ".1"

Created on 2021-04-05 by the reprex package (v0.3.0)

But this is not what I want: the first column is empty, and the second one still has the dot.

Following the comments in the post I linked above, I tried adding fixed="." or fixed=TRUE, but neither seems to work:

library(tidyverse)

df1 <- data.frame(Sequence.Name= c(2.1, 2.1, 2.1),
                  var1 = c(1, 1, 0))
df1$Sequence.Name %>% 
  as.character %>%
  str_split_fixed(".",fixed=".",2) %>%
  head
#> Error in str_split_fixed(., ".", fixed = ".", 2): unused argument (fixed = ".")

Created on 2021-04-05 by the reprex package (v0.3.0)
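
For reference, both results above have the same cause. The pattern argument of str_split_fixed is interpreted as a regular expression, in which an unescaped "." matches any character, so the split happens at the leading "2". Also, str_split_fixed has no fixed argument (that belongs to base R's strsplit); in stringr a literal match is requested by wrapping the pattern in fixed(). A minimal sketch, assuming only that stringr is installed (the names x, parts_fixed and parts_regex are illustrative):

```r
library(stringr)

x <- as.character(c(2.1, 2.1, 2.1))

# fixed(".") asks stringr for a literal dot instead of the regex "any character"
parts_fixed <- str_split_fixed(x, fixed("."), 2)

# Escaping the dot inside the regex is an equivalent alternative
parts_regex <- str_split_fixed(x, "\\.", 2)

parts_fixed
#>      [,1] [,2]
#> [1,] "2"  "1"
#> [2,] "2"  "1"
#> [3,] "2"  "1"
```

Both calls return a character matrix, so the pieces still need as.numeric() if numbers are wanted.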

Emy

4 Answers

3

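Something like this? By default, separate() splits on any run of non-alphanumeric characters, so the dot does not need to be escaped: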

df1 %>% separate(Sequence.Name, into = c("Col1", "Col2"))

  Col1 Col2 var1
1    2    1    1
2    2    1    1
3    2    1    0
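
If relying on the default separator feels too implicit, the call above can be sketched with an explicit pattern and type conversion. This assumes the df1 from the question; the name out is illustrative only:

```r
library(tidyr)

df1 <- data.frame(Sequence.Name = c(2.1, 2.1, 2.1),
                  var1 = c(1, 1, 0))

# sep is treated as a regular expression, so the dot is escaped;
# convert = TRUE turns the new character columns into integers where possible
out <- separate(df1, Sequence.Name, into = c("Col1", "Col2"),
                sep = "\\.", convert = TRUE)

str(out)
```

Without convert = TRUE, Col1 and Col2 stay character columns even though they print like numbers.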
AnilGoyal
3

It is also possible to do this in base R with read.table, using the dot as the field separator:

cbind(read.table(text = as.character(df1$Sequence.Name), sep=".", 
         header = FALSE, col.names = c("Col1", "Col2")), df1['var1'])
#  Col1 Col2 var1
#1    2    1    1
#2    2    1    1
#3    2    1    0
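
A related base R route, shown here only as a sketch, is strsplit with fixed = TRUE, which is the base function where the fixed argument mentioned in the question actually exists. The names parts and out are illustrative:

```r
df1 <- data.frame(Sequence.Name = c(2.1, 2.1, 2.1),
                  var1 = c(1, 1, 0))

# fixed = TRUE makes strsplit treat "." as a literal dot, not a regex
parts <- do.call(rbind,
                 strsplit(as.character(df1$Sequence.Name), ".", fixed = TRUE))

# Rebuild the data frame with the two pieces converted back to numbers
out <- data.frame(Col1 = as.numeric(parts[, 1]),
                  Col2 = as.numeric(parts[, 2]),
                  var1 = df1$var1)
out
```

This assumes every value contains exactly one dot; rows with a different number of pieces would need extra handling before rbind.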
akrun
2

Here is a data.table option using tstrsplit

> library(data.table)
> setDT(df1)[, c(lapply(tstrsplit(Sequence.Name, "\\."), as.numeric), .(var1))]
   V1 V2 V3
1:  2  1  1
2:  2  1  1
3:  2  1  0
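
The result above carries the default names V1, V2, V3. A variant that names the new columns and lets tstrsplit do the type conversion might look like the sketch below; dt, Col1 and Col2 are illustrative names, not part of the original answer:

```r
library(data.table)

dt <- data.table(Sequence.Name = c(2.1, 2.1, 2.1),
                 var1 = c(1, 1, 0))

# fixed = TRUE is passed through to strsplit, so "." is a literal dot;
# type.convert = TRUE converts the pieces from character to integer
dt[, c("Col1", "Col2") := tstrsplit(as.character(Sequence.Name), ".",
                                    fixed = TRUE, type.convert = TRUE)]

# Drop the original column once it has been split
dt[, Sequence.Name := NULL]
dt
```

Because := adds columns by reference, the new columns are appended after var1 rather than placed first.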
ThomasIsCoding
2

This also works, using tidyr's extract with one capture group per new column:

library(tidyr)

df1 %>%
  extract(col = Sequence.Name, into = c("Sequence", "Name"), regex = "(\\d+)\\.(\\d+)")

  Sequence Name var1
1        2    1    1
2        2    1    1
3        2    1    0

Anoushiravan R