
I have a matrix that looks like this:

SNP     G1      G2      G3
marker1 TT      CC      TT
marker2 AA      AA      AA
marker3 TT      TT      TT 

And I would like it to look like this:

SNP     
>marker1    TT  G1
>marker2    AA  G1
>marker3    TT  G1
>marker1    CC  G2
>marker2    AA  G2
>marker3    TT  G2
>marker1    TT  G3
>marker2    AA  G3
>marker3    TT  G3

I am using this:

bsp2 <- read.table("C:/R/bsp2.csv", header=TRUE)

reshape(as.data.frame(bsp2), direction="long",
        varying = list(colnames(bsp2)[2:6]), v.names="G", idvar="SNP")

But I am getting the error message "undefined columns selected". Can anyone tell me what I am doing wrong?

marie
  • `bsp2` has only 4 columns, so `colnames(bsp2)[2:6]` generates `"G1" "G2" "G3" NA NA`, which in turn causes the error. – bdemarest May 02 '12 at 17:26
  • Similar question [here](http://stackoverflow.com/q/10234734/210673). Future searchers may also be interested in the reverse; see [this question](http://stackoverflow.com/q/9617348/210673) – Aaron left Stack Overflow May 02 '12 at 17:43
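To see what bdemarest's comment describes, here is a minimal sketch that reads the same table inline (instead of from `bsp2.csv`) and compares the hard-coded `2:6` index with one derived from the column names. The names `bad`, `good`, and `long` are only illustrative, and using `setdiff()` to pick the genotype columns is one possible choice, not necessarily what the asker intended.

```r
# Same data as in the question, read inline rather than from a file
bsp2 <- read.table(text = "SNP G1 G2 G3
marker1 TT CC TT
marker2 AA AA AA
marker3 TT TT TT", header = TRUE)

ncol(bsp2)                    # the table has only 4 columns

# Positions 5 and 6 do not exist, so indexing past the end returns NA
bad <- colnames(bsp2)[2:6]
bad

# Select "everything except SNP" instead of a hard-coded range
good <- setdiff(colnames(bsp2), "SNP")

long <- reshape(bsp2, direction = "long", varying = list(good),
                v.names = "G", idvar = "SNP")
long
```

With `times` left unspecified, the `time` column holds 1, 2, 3 rather than the original column names.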

2 Answers


This will be much easier using melt from reshape2:

dat <- read.table(text = "SNP     G1      G2      G3
marker1 TT      CC      TT
marker2 AA      AA      AA
marker3 TT      TT      TT", header = TRUE, sep = "")

library(reshape2)
melt(dat, id.vars = "SNP")

      SNP variable value
1 marker1       G1    TT
2 marker2       G1    AA
3 marker3       G1    TT
4 marker1       G2    CC
5 marker2       G2    AA
6 marker3       G2    TT
7 marker1       G3    TT
8 marker2       G3    AA
9 marker3       G3    TT
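If the default `variable` and `value` headers are not wanted, `melt` also accepts `variable.name` and `value.name`. The sketch below assumes reshape2 is installed. The column names `group` and `genotype` are made up for illustration, and the final reordering is just one way to get the marker, genotype, group layout shown in the question.

```r
library(reshape2)

dat <- read.table(text = "SNP G1 G2 G3
marker1 TT CC TT
marker2 AA AA AA
marker3 TT TT TT", header = TRUE)

# variable.name / value.name replace the default "variable" / "value" headers;
# "group" and "genotype" are illustrative names, not from the question
long <- melt(dat, id.vars = "SNP",
             variable.name = "group", value.name = "genotype")

# Reorder the columns to marker, genotype, group as in the desired output
long <- long[, c("SNP", "genotype", "group")]
long
```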
joran

Here it is with base R's reshape, though joran is right that melt is likely easier.

bsp2 <- read.table(text="SNP     G1      G2      G3
marker1 TT      CC      TT
marker2 AA      AA      AA
marker3 TT      TT      TT ", header=TRUE)

bsp2.long <- reshape(bsp2, direction="long", varying = 2:4, v.names="G", 
    timevar="TIME", times=paste0("G", 1:3), idvar="SNP")

rownames(bsp2.long) <- seq_len(nrow(bsp2.long))
bsp2.long

Which yields:

      SNP TIME  G
1 marker1   G1 TT
2 marker2   G1 AA
3 marker3   G1 TT
4 marker1   G2 CC
5 marker2   G2 AA
6 marker3   G2 TT
7 marker1   G3 TT
8 marker2   G3 AA
9 marker3   G3 TT

Note that you need R 2.15 for this to work as written, because I used paste0. If you don't have R 2.15 and don't want to install it, replace that argument with times=c("G1", "G2", "G3"). Also, naming the time variable TIME was not necessary, since R would have called it time by default; I did so to show that you have control over that name with reshape.
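As a variation on the call above, the `times` labels can be taken from the column names themselves, which avoids both `paste0` and a hard-coded vector. This is only a sketch: `g_cols` is a name introduced here for illustration, and `timevar` is left out to show the default column name.

```r
bsp2 <- read.table(text = "SNP G1 G2 G3
marker1 TT CC TT
marker2 AA AA AA
marker3 TT TT TT", header = TRUE)

# Every column except the first (SNP) is treated as a genotype column
g_cols <- names(bsp2)[-1]

# times reuses the column names, so no paste0 is needed;
# timevar is omitted, so the label column gets the default name "time"
bsp2.long <- reshape(bsp2, direction = "long", varying = g_cols,
                     v.names = "G", times = g_cols, idvar = "SNP")

rownames(bsp2.long) <- NULL   # reset row names to 1..n
bsp2.long
```

This keeps working if more G columns are added to the table, since nothing depends on there being exactly three.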

Tyler Rinker
  • +1 for reminding me why I've never bothered to really learn how to use `reshape`. Ugh. – joran May 02 '12 at 17:32