Data manipulation

Question

my data format looks like this.

order_id    Prod1   prod2   Prod3   Prod4   Prod5
   A             1     0     1      1     1
   B             0     0     1      1     0
   C             1     1     0      1     1

I want to transform this table in a manner such that All order IDs having Products as "1" should be next to each other under a new var called Product i.e.

order_id   Prod
A          Prod1
A          Prod3
A          Prod4
A          Prod5
B          Prod3
B          Prod4

and so on. I am trying to program it in R. Any help in this regard would be appreciated

score 1 · Accepted Answer · answered Jan 22 '16 at 10:50

We melt the data to 'long' format, subset the rows where 'value' is 1, and then order based on the 'order_id'.

library(reshape2)
d1 <- subset(melt(df1, id.var='order_id'), value!=0, select=1:2)
d1[order(d1$order_id),]
#   order_id variable
#1         A    Prod1
#7         A    Prod3
#10        A    Prod4
#13        A    Prod5
#8         B    Prod3
#11        B    Prod4
#3         C    Prod1
#6         C    prod2
#12        C    Prod4
#15        C    Prod5

Jaap · Answer 2 · 2018-06-02T09:04:43.750

Two alternative solutions:

1: using dplyr and tidyr

library(dplyr)
library(tidyr)
mydf %<>% gather(var, val, -1) %>% filter(val!=0) %>% select(-3) %>% arrange(order_id)

2: using data.table

library(data.table)
mydf <- melt(setDT(mydf), 1)[value!=0][, value := NULL][order(order_id)]

which both give:

> mydf
    order_id variable
 1:        A    Prod1
 2:        A    Prod3
 3:        A    Prod4
 4:        A    Prod5
 5:        B    Prod3
 6:        B    Prod4
 7:        C    Prod1
 8:        C    prod2
 9:        C    Prod4
10:        C    Prod5

Data manipulation

2 Answers2