0

I have two dataframes:

   VESSELNO        Equip  
     12234          503
     12234          504
     12234          505
     12234          506
     12231          502
     12231          503
     12231          503  

other dataframe has following columns

    VESSELNO        Equip  
     12234          503
     12234          604
     12234          605
     12234          506
     12231          602
     12231          603
     12231          503  

I want to calculate count of unique Equip combining both dataframes We can use data.table to group by VESSELNO and get the Equip

  dt[,paste(Equip,collapse = ","), by = VESSELNO]

How can I then compare unique counts?

 VESSELNO         Equip  
     12234        (503,504,505,506,604,605)  = 6
     12231        (502,503,602,603) = 4  
zx8754
  • 52,746
  • 12
  • 114
  • 209
Neil
  • 7,937
  • 22
  • 87
  • 145

1 Answers1

4

We can rbind the two datasets, grouped by 'VESSELNO' get the count of unique elements with uniqueN

library(data.table)
setDT(rbind(df1, df2))[, .(Count = uniqueN(Equip)), VESSELNO]
#    VESSELNO Count
#1:    12234     6
#2:    12231     4

If we are using aggregate, then use the anonymous function to get the length of unique elements

 aggregate(Equip ~VESSELNO, rbind(df1, df2), FUN = function(x) length(unique(x)))
akrun
  • 874,273
  • 37
  • 540
  • 662