I have a data.frame that looks like:
Input_SNP       Set_1           Set_2           Set_3           
4:184648954     18:71883827     7:135798891     7:91206783    
13:34371442     14:52254555     1:223293324     7:54912662     
18:71883393     22:50428069     7:138698825     8:97486210     

For each of these columns, I would like to extract the number before the colon into a new column with the suffix CHR, and the number after the colon into a new column with the suffix BP. To make this clearer, this is what the desired output would be:

    Input_SNP_CHR   Input_SNP_BP     Set_1_CHR   Set_1_BP     Set_2_CHR   Set_2_BP     Set_3_CHR   Set_3_BP
    4                 184648954           18     71883827       7         135798891      7        91206783  
    13                34371442            14     52254555       1         223293324      7        54912662
    18                71883393            22     50428069       7         138698825      8        97486210

Thus, I would like to start with N columns and finish with 2N columns. How can I do this? Should I use grep?

Evan

1 Answer


Here's one way to do it:

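library(splitstackshape)
# Recreate the example data from the question
df <- read.table(header = TRUE, text = "Input_SNP       Set_1           Set_2           Set_3           
4:184648954     18:71883827     7:135798891     7:91206783    
13:34371442     14:52254555     1:223293324     7:54912662     
18:71883393     22:50428069     7:138698825     8:97486210")
# Split columns 1 to 4 on ":"; each column X becomes X_1 and X_2
df <- cSplit(df, 1:4, ":")
# Replace the trailing _1/_2 with CHR/BP (the two suffixes are recycled across the column pairs)
names(df) <- paste0(sub("(.*_).*", "\\1", names(df)), c("CHR", "BP"))
df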
#    Input_SNP_CHR Input_SNP_BP Set_1_CHR Set_1_BP Set_2_CHR  Set_2_BP Set_3_CHR Set_3_BP
# 1:             4    184648954        18 71883827         7 135798891         7 91206783
# 2:            13     34371442        14 52254555         1 223293324         7 54912662
# 3:            18     71883393        22 50428069         7 138698825         8 97486210
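
If you would rather not depend on an extra package, roughly the same result can be sketched in base R. This is only an illustration under the assumption that every cell holds at most one colon; `split_chr_bp` is a hypothetical helper name, not something from splitstackshape, and the result is a plain data.frame rather than a data.table.

```r
# Example data from the question
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "Input_SNP Set_1 Set_2 Set_3
4:184648954 18:71883827 7:135798891 7:91206783
13:34371442 14:52254555 1:223293324 7:54912662
18:71883393 22:50428069 7:138698825 8:97486210")

# Hypothetical helper: split one column of "CHR:BP" strings into two columns
split_chr_bp <- function(x) {
  parts <- strsplit(as.character(x), ":", fixed = TRUE)
  chr <- vapply(parts, function(p) p[1], character(1))
  bp  <- vapply(parts, function(p) p[2], character(1))  # NA if there is no second piece
  data.frame(
    CHR = type.convert(chr, as.is = TRUE),
    BP  = type.convert(bp, as.is = TRUE),
    stringsAsFactors = FALSE
  )
}

# Apply the helper to every column, then bind the pieces side by side
pieces <- lapply(df, split_chr_bp)
out <- do.call(cbind, unname(pieces))

# Name the 2N result columns explicitly: <original name>_CHR, <original name>_BP
names(out) <- paste0(rep(names(df), each = 2), c("_CHR", "_BP"))
out
```

Because the helper is applied with `lapply(df, ...)`, nothing has to change when the number of columns does.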
lukeA
  • What does 1:4 do? If I had 10,000 columns and rows, would those numbers change? – Evan Dec 05 '15 at 20:17
  • `1:4` selects columns 1 to 4 for splitting. So the selection would change accordingly. – lukeA Dec 05 '15 at 20:20
  • This worked, but what would happen if one of the cells didn't have a colon, would it make the cell an NA? – Evan Dec 07 '15 at 19:29
  • Well, you can try it out with the example above - it would create an `NA` for the missing part, yes. – lukeA Dec 07 '15 at 19:56
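
Both points from the comments can be tried on a small variation of the example. The sketch below assumes splitstackshape is installed; the two-column data set with a colon-less cell is made up for illustration. `seq_along(df)` stands in for `1:4`, so the selection follows the number of columns, and the cell without a colon should come back with `NA` in its BP column, consistent with the reply above.

```r
library(splitstackshape)

# Made-up variation of the example: the second Input_SNP cell has no colon
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "Input_SNP Set_1
4:184648954 18:71883827
13 14:52254555
18:71883393 22:50428069")

# seq_along(df) is 1:ncol(df), so every column is split however many there are
out <- cSplit(df, seq_along(df), ":")

# Same renaming as in the answer: trailing _1/_2 become CHR/BP
names(out) <- paste0(sub("(.*_).*", "\\1", names(out)), c("CHR", "BP"))
out

# Check the cell that had no colon
is.na(out$Input_SNP_BP[2])
```

Note that the renaming step relies on every input column producing exactly two output columns, so it is worth checking `ncol(out)` before assigning names on real data.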