0

I have the following data:

CustomerNumber | LogEntry
1                2016-02-03
1                2016-02-05
1                2016-02-07
2                2016-02-05
2                2016-02-07

And I want it summarized like this:

CustomerNumber | Frequency
1                3
2                2

I found a way of solving it, but it doesn't feel very elegant (I'm a novice).

customer_log <- active_sessions

unique <- unique(customer_log$customer_number)

session_starts = character()
customer_number = character()

for (i in 1:length(unique)) {
  session_starts[i] <- nrow(na.omit(customer_log[customer_log$customer_number == unique[i],]))
  customer_number[i] <- unique[i]
}

active_sessions = data.frame(session_starts, customer_number)

Does anyone have a suggestion on how to better solve this? Many thanks.

David Beck
  • 975
  • 8
  • 12

1 Answers1

1

You can easily achieve this using the sqldf package.

library(sqldf)
sqldf("SELECT CustomerNumber, count(*) AS Frequency FROM df GROUP BY CustomerNumber") 
CuriousBeing
  • 1,592
  • 14
  • 34