0

(sample code below) I have two data sets. One is a library of products, the other is customer id, date and viewed product and another detail.I want to get a merge where I see per each id AND date all the library of products as well as where the match was. I have tried using full_join and merge and right and left joins, but they do not repeat the rows. below is the sample of what i am trying to achieve.

id=c(1,1,1,1,2,2)
 date=c(1,1,2,2,1,3)
 offer=c('a','x','y','x','y','a')
 section=c('general','kitchen','general','general','general','kitchen')
 t=data.frame(id,date,offer,section)

offer=c('a','x','y','z')
 library=data.frame(offer)
######
t table
  id date offer section
1  1    1     a general
2  1    1     x kitchen
3  1    2     y general
4  1    2     x general
5  2    1     y general
6  2    3     a kitchen

library table
  offer
1     a
2     x
3     y
4     z

and i want to get this:

  id date offer section
1  1    1     a general
2  1    1     x kitchen
3  1    1     y  NA
4  1    1     z general
...

(there would have to be 6*4 observations) I realize because I match by offer it is not going to repeat the values like so, but what is another option to do that? Thanks a lot!!

tchinko
  • 5
  • 2

2 Answers2

0

You can use tidyr and dplyr to get the data. The crossing() function will create all combinations of the variables you pass in

library(dplyr)
library(tidyr)
t %>% 
  select(id, date) %>% 
  {crossing(id=.$id, date=.$date, library)} %>% 
  left_join(t)
MrFlick
  • 195,160
  • 17
  • 277
  • 295
0

You can use complete to get all combinations of library$offer for each id and date.

tidyr::complete(t, id, date, offer = library$offer)

# A tibble: 24 x 4
#      id  date offer section
#   <dbl> <dbl> <chr> <chr>  
# 1     1     1 a     general
# 2     1     1 x     kitchen
# 3     1     1 y     NA     
# 4     1     1 z     NA     
# 5     1     2 a     NA     
# 6     1     2 x     general
# 7     1     2 y     general
# 8     1     2 z     NA     
# 9     1     3 a     NA     
#10     1     3 x     NA     
# … with 14 more rows
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213