-2

I would like to reshape the following data.table

library(data.table)
  myfun <- function()  sample(c(NA,round(runif(9)*10)),prob=c(0.2,rep(0.1,9)))
  cheeze <-  myfun()
  bottle <-  myfun()

  df <-  as.data.table(data.frame(ID=LETTERS[1:10],
     bottle_qty=bottle,
     bottle_price=bottle*c(1,3,5),
     cheeze_qty=cheeze,
     cheeze_price=cheeze*c(5,4,2),
     cheeze_cam = 1*(cheeze>4) ,
     cheeze_brie = 1*(cheeze<=4),
     bottle_wine = 1*(bottle>5),
     bottle_beer = 1*(bottle<=5))
     )
 #  ID bottle_qty bottle_price cheeze_qty cheeze_price cheeze_cam cheeze_brie
#  1:  A          7            7          9           45          1           0
#  2:  B          4           12          6           24          1           0
#  3:  C         NA           NA         NA           NA         NA          NA
#  4:  D          7            7          2           10          0           1
#  5:  E          3            9          9           36          1           0
#  6:  F          9           45          4            8          0           1
#  7:  G          6            6          3           15          0           1
#  8:  H          2            6          6           24          1           0
#  9:  I          5           25          8           16          1           0
# 10:  J          7            7          3           15          0           1
#   bottle_wine bottle_beer
#  1:           1           0
#  2:           0           1
#  3:          NA          NA
#  4:           1           0
#  5:           0           1
#  6:           1           0
#  7:           1           0
#  8:           0           1
#  9:           0           1
# 10:           1           0

into the following:

| ID | type        | qty | price |
| A  | cheeze_cam  |   9 |    45 |
| A  | bottle_wine |   7 |     7 |
| B  | bottle_beer |   4 |    12 |
| B  | cheeze_cam  |   6 |    24 |

Edit That is the full expected output.

| ID | type        | qty | price |
|----+-------------+-----+-------|
| A  | bottle_wine |   7 |     7 |
| A  | cheeze_cam  |   9 |    45 |
| B  | bottle_beer |   4 |    12 |
| B  | cheeze_cam  |   6 |    24 |
| C  | bottle_wine |  NA |    NA |
| C  | cheeze_brie |  NA |    NA |
| D  | bottle_wine |   7 |     7 |
| D  | cheeze_brie |   2 |    10 |
| E  | bottle_beer |   3 |     9 |
| E  | cheeze_cam  |   9 |    36 |
| F  | bottle_wine |   9 |    45 |
| F  | cheeze_brie |   4 |     8 |
| G  | bottle_wine |   6 |     6 |
| G  | cheeze_brie |   3 |    15 |
| H  | bottle_beer |   2 |     6 |
| H  | cheeze_cam  |   6 |    24 |
| I  | bottle_beer |   5 |    25 |
| I  | cheeze_cam  |   8 |    16 |
| J  | bottle_wine |   7 |     7 |
| J  | cheeze_brie |   3 |    15 |

But the x object is not found. Any help please?

DJJ
  • 2,481
  • 2
  • 28
  • 53
  • 1
    Try `melt(melt(df, measure=patterns("qty$", "price$"), value.name=c('qty', 'price'), variable.name="var", na.rm=TRUE), id.var=c('ID','var', 'qty', 'price'), na.rm=TRUE)[order(ID)]` – akrun Mar 05 '16 at 16:23
  • Nice try thanks. Your initial suggestion was quite interesting actually. I built on it to produce this `melt(df, id.var="ID",measure=patterns("cheeze_qty$", "cheeze_price$"), na.rm=TRUE)`. But the lapply doesn't seem to work right away – DJJ Mar 05 '16 at 16:33
  • what is not working in @akrun solution? I don't understant from where the `lapply` problem comes from... Could you precise your question? – cderv Mar 05 '16 at 16:45
  • @Akrun solution is not quite right as it gives duplicates, that cannot be removed automatically – DJJ Mar 05 '16 at 16:52
  • Can you post the full expected output? Perhaps `unique(dM, by = c("ID", "qty", "price"))` if `dM <- melt(melt(...` – akrun Mar 05 '16 at 16:58
  • @akrun I've posted the expected output. Thanks for the suggestion. – DJJ Mar 05 '16 at 17:20
  • @akrun you should that as a solution imo – Jaap Mar 05 '16 at 17:20
  • 2
    @Akrun you were closer to the answer. I wrongly thought your results couldn't be refined. Here is what I was looking for. `melt(melt(df, measure=patterns("qty$", "price$"), value.name=c('qty', 'price'), variable.name="var", na.rm=TRUE), id.var=c('ID','var', 'qty', 'price'), na.rm=TRUE)[order(ID)][value==1,][like(variable,"cheeze")&var==1|like(variable,"bottle")&var==2,]`. Really nice solution. – DJJ Mar 05 '16 at 17:33
  • @DJJ You could post as an answer with the modification. – akrun Mar 05 '16 at 17:36

1 Answers1

1

Thanks everyone for help. The credits goes to @akrun. I'm only building on his suggestion.

The fist melt will stack the all the price and quantity columns without any other consideration. Therefore as we have two columns,one quantity and price. The number of rows the first melt should be twice the number of rows in original table. In the process we have the variable var that is created. var1 being the for cheeze and var2 to being for bottle.

Once the first melt is done the rest is straightforward. We only, need to melt the specifications. And use var to clean the table to get the desired specification.

melt(melt(df, measure=patterns("qty$", "price$"), value.name=c('qty', 'price'), variable.name="var", na.rm=TRUE), id.var=c('ID','var', 'qty', 'price'), na.rm=TRUE)[order(ID)][value==1,][like(variable,"cheeze")&var==1|like(variable,"‌​bottle")&var==2,]

##    ID var qty price    variable value
## 1:  A   1   6     6 cheeze_brie     1
## 2:  B   1   8    24  cheeze_cam     1
## 3:  C   1   1     5 cheeze_brie     1
## 4:  D   1   5     5  cheeze_cam     1
## 5:  E   1   4    12  cheeze_cam     1
## 6:  H   1   1     3  cheeze_cam     1
## 7:  I   1   9    45 cheeze_brie     1
## 8:  J   1   4     4 cheeze_brie     1
DJJ
  • 2,481
  • 2
  • 28
  • 53