
I am new to bioinformatics and R. I have a data frame that has three columns and looks like this:

                          name   X   Y
1                         4052 153 302
2                         7057  80 279
3                         8454 466 266
4                         9978 466 249
5          3397 3398 3399 3400 769 142
6                    1874 1875 723 325

The name column holds gene IDs. Row 5 contains four gene IDs together, and row 6 contains two. I want to split them so that each ID gets its own row in the data frame, with the same X and Y values as the original row. The output should look like this:

                          name   X   Y
1                         4052 153 302
2                         7057  80 279
3                         8454 466 266
4                         9978 466 249
5                         3400 769 142
6                         1875 723 325
7                         3399 769 142
8                         3398 769 142
9                         3397 769 142
10                        1874 723 325
Cœur
Saad

1 Answer

The `separate_rows()` function from the tidyr package gives a really simple solution.

First, replicate the data:

data <- data.frame(name = c("4052", "7057","8454","9978","3397 3398 3399 3400","1874 1875"),
                   X = c(153,80,466,466,769,723),
                   Y = c(302,279,266,249,142,325), stringsAsFactors = FALSE)

which gives:

                 name   X   Y
1                4052 153 302
2                7057  80 279
3                8454 466 266
4                9978 466 249
5 3397 3398 3399 3400 769 142
6           1874 1875 723 325

Now the magic :) :

    library(tidyr)

    # convert = TRUE turns the split IDs from character into integer
    separate_rows(data, name, convert = TRUE)

Result:

     X   Y name
1  153 302 4052
2   80 279 7057
3  466 266 8454
4  466 249 9978
5  769 142 3397
6  769 142 3398
7  769 142 3399
8  769 142 3400
9  723 325 1874
10 723 325 1875
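
If you would rather not depend on tidyr, roughly the same thing can be sketched in base R. This is only one possible approach, not what `separate_rows()` does internally: it assumes the IDs are separated by whitespace and are all integers, and the names `ids`, `row_idx` and `expanded` are made up for this example. The data uses the values from the question.

```r
# Example data, using the values from the question
data <- data.frame(
  name = c("4052", "7057", "8454", "9978", "3397 3398 3399 3400", "1874 1875"),
  X = c(153, 80, 466, 466, 769, 723),
  Y = c(302, 279, 266, 249, 142, 325),
  stringsAsFactors = FALSE
)

# Split each name on runs of whitespace: one character vector per original row
ids <- strsplit(data$name, "[[:space:]]+")

# Repeat each row index once for every ID found in that row
row_idx <- rep(seq_len(nrow(data)), lengths(ids))

# Build the long data frame: one ID per row, X and Y copied from the source row
# (as.integer assumes every ID is a whole number)
expanded <- data.frame(
  name = as.integer(unlist(ids)),
  X = data$X[row_idx],
  Y = data$Y[row_idx]
)

print(expanded)
```

The rows come out in the original order, with the IDs of a combined row listed one after another, and the columns stay in the order name, X, Y.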
Menelith