How to split a column when there is a weird/no space

Question

Here is the head():

Here is the View():

I want to split Ensemblnames and gene names into two different columns. I don't really understand how to make it split on "\", or whether there are other ways to split it.

What I tried

df i<-
  tidyr::separate(
    data = df,
    col = Ensemblnames,
    sep = " \ ",
    into = colmn,
    remove = FALSE)

Error in UseMethod("separate") : 
  no applicable method for 'separate' applied to an object of class "function"

It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Please [do not post code or data in images](https://meta.stackoverflow.com/q/285551/2372064) — MrFlick, Nov 29 '22 at 19:13
It seems like your data is named `Ensemblnames`, not `df`. Make sure to use the right name in `separate(data=)` — MrFlick, Nov 29 '22 at 19:14
In the `head` view, we can see `\t` which is a tab character. Backslashes need escaping so `sep = "\\t"` would probably work, as should the default if you don't specify `sep` at all. But probably your data was read in incorrectly--rather than separating it now you could go back to the command you used to read in the data and use a tab separator (the default for `readr::read_tsv`) and then it would be correct from the start. — Gregor Thomas, Nov 29 '22 at 19:27
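
To make the two comments above concrete, here is a minimal sketch. It assumes that no object called `df` had been created in the session, in which case the name resolves to the built-in `stats::df` function, which would explain the "object of class "function"" error. The data frame `genes` and the output column names `gene_id` and `gene_name` are hypothetical stand-ins, not the asker's real data.

```r
library(tidyr)

# With no user-defined `df` in the session, the name refers to stats::df,
# the density function of the F distribution, which is a function.
class(stats::df)
#> [1] "function"

# Hypothetical one-column data frame standing in for the real data.
genes <- data.frame(
  Ensemblnames = c("ENSG00000000003\tTSPAN6", "ENSG00000000419\tDPM1"),
  stringsAsFactors = FALSE
)

# "\\t" is the regular-expression escape for a tab character.
split_genes <- separate(
  data = genes,
  col = Ensemblnames,
  into = c("gene_id", "gene_name"),
  sep = "\\t"
)
split_genes
```

Leaving `sep` out should also work here, since the default pattern of `separate()` splits on any run of non-alphanumeric characters, and a tab is one.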

score 1 · Answer 1 · answered Nov 29 '22 at 20:26

Is this what you need?

library(dplyr)
library(tidyr)

Data

df <- tibble::tribble(
           ~gene_id.gene_name,
   "ENSG00000000003\tTSPAN6",
      "ENSG00000000005\tTND",
     "ENSG00000000419\tDPM1",
     "ENSG00000000457\tSCYL3",
  "ENSG00000000460\tc1orf112",
       "ENSG00000000938\tFGR"
  )

Solution

df %>% 
  separate(col = gene_id.gene_name, into = c("a", "b"), sep = "\t")

Replace "a" and "b" with the column names you want.

Output

#> # A tibble: 6 × 2
#>   a               b       
#>   <chr>           <chr>   
#> 1 ENSG00000000003 TSPAN6  
#> 2 ENSG00000000005 TND     
#> 3 ENSG00000000419 DPM1    
#> 4 ENSG00000000457 SCYL3   
#> 5 ENSG00000000460 c1orf112
#> 6 ENSG00000000938 FGR

Created on 2022-11-29 with reprex v2.0.2
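
As Gregor Thomas suggests in the comments, the split may be avoidable altogether by reading the file as tab-separated in the first place. The sketch below assumes readr 2.0 or later, where `I()` marks a string as literal data rather than a file path. The file contents and the header names `gene_id` and `gene_name` are made up for illustration.

```r
library(readr)

# Hypothetical file contents: a header row plus two tab-separated records.
# I() tells readr (>= 2.0) to treat the string as data, not as a path.
raw_text <- I(
  "gene_id\tgene_name\nENSG00000000003\tTSPAN6\nENSG00000000419\tDPM1\n"
)

# read_tsv() uses a tab as the delimiter, so both columns arrive separated.
genes <- read_tsv(raw_text, show_col_types = FALSE)
genes
```

With a real file, the call would take the file path instead of `raw_text`.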
