-3

I want to have count of booking IDs on Month-Source Level

Month   Source  Booking_id
Oct        A    100
Nov        B    101
Oct        A    106
Jan        B    109
Nov        A    110
Nov        B    111


data <- structure(list(Month = c("October", "November", "October", "January", 
"November", "November"), Source = c("A", "B", "A", "B", "A", 
"B"), Booking_ID = c(100L, 101L, 106L, 109L, 110L, 111L)), .Names = c("Month", 
"Source", "Booking_ID"), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -6L))
mpalanco
  • 12,960
  • 2
  • 59
  • 67
Akshit
  • 349
  • 1
  • 3
  • 10

3 Answers3

2

Maybe This could help:

table(data$Month, data$Booking_id)

#     100 101 106 109 110 111
# Jan   0   0   0   1   0   0
# Nov   0   1   0   0   1   1
# Oct   1   0   1   0   0   0


table(data$Month, data$Source)

#     A B
# Jan 0 1
# Nov 1 2
# Oct 2 0
1

Two alternatives:

1. aggregate

aggregate(Booking_ID ~ Month + Source, data, FUN = "length")

Output:

     Month Source Booking_ID
1 November      A          1
2  October      A          2
3  January      B          1
4 November      B          2

2. sqldf

library(sqldf)
sqldf("SELECT  Month, Source, COUNT(*) AS Count FROM data GROUP BY Month, Source")

Output:

     Month Source Count
1  January      B     1
2 November      A     1
3 November      B     2
4  October      A     2
mpalanco
  • 12,960
  • 2
  • 59
  • 67
0

We can use dplyr. We group by 'Month', 'Source' and get the n_distinct of 'Booking_id' i.e. number of unique elements of 'Booking_id' or if we need the total number use n().

library(dplyr)
data %>%
  group_by(Month, Source) %>%
  summarise(n= n_distinct(Booking_ID))
  #if we wanted the total count instead of unique
  #summarise(n=n()) 

#    Month Source     n
#     (chr)  (chr) (int)
#1  January      B     1
#2 November      A     1
#3 November      B     2
#4  October      A     2
akrun
  • 874,273
  • 37
  • 540
  • 662