0

For example I have data('anscombe.txt'):

 case x1 x2 x3 x4    y1   y2    y3    y4
 1:    1 10 10 10  8  8.04 9.14  7.46  6.58
 2:    2  8  8  8  8  6.95 8.14  6.77  5.76
 3:    3 13 13 13  8  7.58 8.74 12.74  7.71
 4:    4  9  9  9  8  8.81 8.77  7.11  8.84
 5:    5 11 11 11  8  8.33 9.26  7.81  8.47
 6:    6 14 14 14  8  9.96 8.10  8.84  7.04
 7:    7  6  6  6  8  7.24 6.13  6.08  5.25
 8:    8  4  4  4 19  4.26 3.10  5.39 12.50
 9:    9 12 12 12  8 10.84 9.13  8.15  5.56
10:   10  7  7  7  8  4.82 7.26  6.42  7.91
11:   11  5  5  5  8  5.68 4.74  5.73  6.89

I want to reshape them in the following way:

    x1   variable  value
 1: 10       y1  8.04
 2:  8       y1  6.95
 3: 13       y1  7.58
 4:  9       y1  8.81
 5: 11       y1  8.33
 6: 14       y1  9.96
 7:  6       y1  7.24
 8:  4       y1  4.26
 9: 12       y1 10.84
  ...         ......
34:  8       y4  6.58
35:  8       y4  5.76
36:  8       y4  7.71
37:  8       y4  8.84
38:  8       y4  8.47
39:  8       y4  7.04
40:  19      y4  5.25
41:  8       y4 12.50
42:  8       y4  5.56
43:  8       y4  7.91
44:  8       y4  6.89

So, that column x1 is equal to c(x1,x2,x3,x4) and value is equal to c(y1,y2,y3,y4)

How can i do this with melt function(or some equivalent in data.table package)?

I tried this: ansc_new <- melt(anscombe,id.vars = 'x1',measure.vars = c('y1','y2','y3','y4'))

But last 11 values of x1 are not want I want to see.

Mat_nekras
  • 81
  • 6
  • Does it have to be from data.table? –  Mar 20 '17 at 21:45
  • @JulianCienfuegos ideally yes. But if there is no oppurtunity to keep it as data.table - than other variants are also acceptable – Mat_nekras Mar 20 '17 at 21:50
  • Is there a reason for the separate `x1` column? Depending on what you plan to do next, it might be easier (and more usual) to reshape with just 2 columns, so x1-x4 would also be in the `variable` column. – neilfws Mar 20 '17 at 22:02
  • If I understand correctly, you can specify measure.vars differently: `melt(DT, id="case", meas=patterns("x", "y"))` You can get rid of the extra cols you don't want; and can also use variable.name to assign different col names. – Frank Mar 20 '17 at 22:28

2 Answers2

0

Here's a dplyr and tidyr solution, which gathers the x and y variable separately then binds them back together. I'd consider whether you need a separate x1 column: it might be easier to simply gather both x and y labels into the variable column.

library(dplyr)
library(tidyr)

anscombe %>% 
  gather(variable.x, value, -y1, -y2, -y3, -y4) %>% 
  select(variable.x = value) %>% 
  bind_cols(gather(anscombe, variable, value, -x1, -x2, -x3, -x4)) %>%
  select(x1 = variable.x, variable, value)
neilfws
  • 32,751
  • 5
  • 50
  • 63
0

I believe you can get what you want using lapply and rbindlist from data.table:

l<-lapply(1:4,function(z){
  melt(anscombe,id.vars = paste0("x",z),,measure.vars = paste0("y",z))
})
rbindlist(l)
Pdubbs
  • 1,967
  • 2
  • 11
  • 20