Match and Fill Values in R

Question

I have a data set containing 3 columns. The first column contains product names (A through E), and the other two columns contain each product's two nearest neighbors, i.e. customers who own the product in the first column are more likely to buy these two products next.

m1 = data.frame(Product=c("A","B","C","D","E"), V1=c("C","A","A","A","D"), 
                V2=c("D","D","B","E","A"))

In the second data set, I have data at the user level. The first column contains user IDs, and the next 5 columns indicate whether the user owns each product: 1 = owns it, 0 = doesn't own it.

m2 = data.frame(ID = c(1:7), A = rbinom(7,1,1/2), B = rbinom(7,1,1/2), 
                C = rbinom(7,1,1/2), D = rbinom(7,1,1/2), E = rbinom(7,1,1/2))

I want product recommendations at the user level. I want m1 to be merged with m2 based on which products each user owns. The output should look like this:

User - 1 A D

Please use `set.seed` to make your input reproducible and then show the complete output expected from the inputs. — G. Grothendieck, Dec 25 '15 at 15:36

score 0 · Accepted Answer · edited May 23 '17 at 11:52

You haven't posted a reproducible example or the exact expected results, but this seems to do what you want.

set.seed(321)
m1 = data.frame(Product=c("A","B","C","D","E"), V1=c("C","A","A","A","D"), 
                V2=c("D","D","B","E","A"))
m2 = data.frame(ID = c(1:7), A = rbinom(7,1,1/2), B = rbinom(7,1,1/2), 
                C = rbinom(7,1,1/2), D = rbinom(7,1,1/2), E = rbinom(7,1,1/2))

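# For each client (row of m2), find the two most frequent neighbours of the owned products
recommended <- apply(m2, 1, function(x) {
  # x[-1] drops the ID; keep the m1 rows for owned products, without the Product column
  client.recommended <- m1[as.logical(x[-1]),-1]
  # Count how often each neighbour occurs and keep the two most common
  top <- names(sort(table(as.vector(t(client.recommended))),
                    decreasing = TRUE)[1:2])
  c(x[1], top)
})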

recommended <- as.data.frame(t(recommended), stringsAsFactors = FALSE)

  ID V2 V3
1  1  A  B
2  2  A  D
3  3  A  B
4  4  A  D
5  5  A  D
6  6  A  D
7  7  A  B

What this code does:

1. For every row in the m2 data.frame (every client), take that row.
2. Take the subset of the m1 data.frame corresponding to the values found in the row (if the client owns "A" and "B", take rows "A" and "B" from m1).
3. Turn this subset into a vector.
4. Count the occurrences of each unique value in the vector.
5. Sort the unique values by count.
6. Take the two most common unique values.
7. Return these values along with the client ID.
8. Turn everything into a proper data.frame for further processing.
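
To make the steps above concrete, here is a minimal sketch that walks through them for a single client. The `owned` vector is a hypothetical client invented for illustration (it is not a row of the random m2 above), and `stringsAsFactors = FALSE` is set explicitly so the sketch behaves the same across R versions.

```r
m1 <- data.frame(Product = c("A", "B", "C", "D", "E"),
                 V1 = c("C", "A", "A", "A", "D"),
                 V2 = c("D", "D", "B", "E", "A"),
                 stringsAsFactors = FALSE)

# Hypothetical client who owns products A and B (assumption, not taken from m2)
owned <- c(A = 1, B = 1, C = 0, D = 0, E = 0)

# Rows of m1 for the owned products, neighbour columns only
client.recommended <- m1[as.logical(owned), -1]

# Flatten the subset into a plain character vector of candidate products
candidates <- as.vector(t(client.recommended))

# Count each candidate and sort by count, most frequent first
counts <- sort(table(candidates), decreasing = TRUE)

# Keep the two most common candidates
top <- names(counts)[1:2]
print(counts)
print(top)
```

Here "D" is a neighbour of both A and B, so it is counted twice and comes first. Note that the result can include a product the client already owns ("A" is a neighbour of B), because the code does not filter owned products out.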

It seems that you expect only two products for each client, and that is what this code does. For products with the same number of occurrences, apparently the one that comes first alphabetically wins. You can get all recommended products by dropping the [1:2] part, but then you will need to figure out how to coerce vectors of uneven length into a single data.frame.
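
One possible way to handle the uneven-length case is to collect the full recommendation vectors in a list and pad the shorter ones with NA before binding them together. This is only a sketch under assumptions: the small `m2` below is a hypothetical, fixed ownership table (not the random one above), and `all.recommended`, `padded`, `result` and `collapsed` are names made up for this example.

```r
m1 <- data.frame(Product = c("A", "B", "C", "D", "E"),
                 V1 = c("C", "A", "A", "A", "D"),
                 V2 = c("D", "D", "B", "E", "A"),
                 stringsAsFactors = FALSE)

# Hypothetical ownership table with three clients (assumption for illustration)
m2 <- data.frame(ID = 1:3,
                 A = c(1, 0, 1), B = c(1, 0, 0), C = c(0, 0, 1),
                 D = c(0, 1, 0), E = c(0, 0, 1))

# All recommended products per client, most frequent first; lengths may differ
all.recommended <- lapply(seq_len(nrow(m2)), function(i) {
  owned <- as.logical(unlist(m2[i, -1]))
  candidates <- as.vector(t(m1[owned, -1]))
  names(sort(table(candidates), decreasing = TRUE))
})

# Pad every vector with NA up to the longest one, then bind the rows together
max.len <- max(lengths(all.recommended))
padded <- do.call(rbind, lapply(all.recommended, function(v) {
  c(v, rep(NA_character_, max.len - length(v)))
}))
result <- data.frame(ID = m2$ID, padded, stringsAsFactors = FALSE)
print(result)

# Alternative: one comma-separated string per client, no padding needed
collapsed <- vapply(all.recommended, paste, character(1), collapse = ", ")
print(data.frame(ID = m2$ID, Recommended = collapsed))
```

Padding keeps one product per column, which is convenient for later joins. The collapsed string is simpler if the recommendations only need to be displayed.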
