Aggregate by column and compare the text

Question

I have two dataframes:

   VESSELNO        Equip  
     12234          503
     12234          504
     12234          505
     12234          506
     12231          502
     12231          503
     12231          503

other dataframe has following columns

    VESSELNO        Equip  
     12234          503
     12234          604
     12234          605
     12234          506
     12231          602
     12231          603
     12231          503

I want to calculate count of unique Equip combining both dataframes We can use data.table to group by VESSELNO and get the Equip

  dt[,paste(Equip,collapse = ","), by = VESSELNO]

How can I then compare unique counts?

 VESSELNO         Equip  
     12234        (503,504,505,506,604,605)  = 6
     12231        (502,503,602,603) = 4

library(dplyr) dt <- dt%>%group_by(VESSELNO)%>%summarise(x = n_distinct(Equip)) — AntonCH, Jul 03 '17 at 07:46

score 4 · Accepted Answer · answered Jul 03 '17 at 07:45

We can rbind the two datasets, grouped by 'VESSELNO' get the count of unique elements with uniqueN

library(data.table)
setDT(rbind(df1, df2))[, .(Count = uniqueN(Equip)), VESSELNO]
#    VESSELNO Count
#1:    12234     6
#2:    12231     4

If we are using aggregate, then use the anonymous function to get the length of unique elements

 aggregate(Equip ~VESSELNO, rbind(df1, df2), FUN = function(x) length(unique(x)))

Aggregate by column and compare the text

1 Answers1