How to do grouping and inner join using dplyr

Question

I have the following data frame:

df<-data.frame(date=c("2019-01-06","2019-01-13","2019-01-20","2019-01-06","2019-01-13","2019-01-20","2019-01-06","2019-01-13","2019-01-20","2019-01-27"),
               version=c("A","A","A","B","B","B","C","C","C","C"),
               quantity=c(12,2,5,10,2,12,4,5,6,8),
               price=c(2.2,2.3,2.2,2.4,2.3,2.6,2.2,2.5,2.4,2.8))

df

         date version quantity price
1  2019-01-06       A       12   2.2
2  2019-01-13       A        2   2.3
3  2019-01-20       A        5   2.2
4  2019-01-06       B       10   2.4
5  2019-01-13       B        2   2.3
6  2019-01-20       B       12   2.6
7  2019-01-06       C        4   2.2
8  2019-01-13       C        5   2.5
9  2019-01-20       C        6   2.4
10 2019-01-27       C        8   2.8

I would like to group by version column and inner join the group (common across groups)) on date column to have the following filtered output:

      date      quantity_A price_A quantity_B price_B quantity_C price_C
1  2019-01-06       12       2.2          10     2.4         4   2.2     
2  2019-01-13        2       2.3           2     2.3         5   2.5
3  2019-01-20        5       2.2          12     2.6         6   2.4

Thanks.

score 2 · Accepted Answer · answered Feb 17 '21 at 08:36

You could reshape the data and drop the NA values so that it behaves like inner join.

tidyr::pivot_wider(df,names_from = version, values_from = c(quantity, price)) %>%
  na.omit()

#   date       quantity_A quantity_B quantity_C price_A price_B price_C
#  <chr>           <dbl>      <dbl>      <dbl>   <dbl>   <dbl>   <dbl>
#1 2019-01-06         12         10          4     2.2     2.4     2.2
#2 2019-01-13          2          2          5     2.3     2.3     2.5
#3 2019-01-20          5         12          6     2.2     2.6     2.4

How to do grouping and inner join using dplyr

1 Answers1