
dataset

I have a dataset like this. I would like to count the number of times each customer visited over a whole year in R, based on the date (using UniSA_Customer_No and Sale_Date).

In a few cases the same customer number and date are repeated. I need to group the rows by date and customer number, and find how many times each customer number visited in the whole year.

Jay
    Images are not the right way to share data/code. Add them in a reproducible format which is easier to copy. Read about [how to give a reproducible example](http://stackoverflow.com/questions/5963269). – Ronak Shah Apr 28 '21 at 06:15

2 Answers


You can do this by "tabulating" each customer with:

table(year2014$UniSA_Customer_No)

It is also possible to cross-tabulate two variables, for example:

table(year2014$UniSA_Customer_No, year2014$Sale_Date)

However, in this case I'd suggest removing duplicates first (see this answer for details).

# select rows from the year 2014 (assumes Sale_Date is formatted as YYYY-MM-DD)
year2014 <- year2014[grep("^2014-", year2014$Sale_Date), ]
# extract only the columns that define a duplicate
cust_date <- year2014[, c("UniSA_Customer_No", "Sale_Date")]
# flag rows that repeat an earlier customer/date pair
dup_rows <- duplicated(cust_date)
# subset to unique rows
year2014unique <- year2014[!dup_rows, ]
# tabulate without duplicates (customers counted once per day)
table(year2014unique$UniSA_Customer_No, year2014unique$Sale_Date)
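To get a single visit count per customer rather than a customer-by-date grid, one option is to tabulate only the customer column after removing duplicates. The sketch below uses base R and a small invented data frame. The column names follow the question, but the values and the names `visits`, `unique_visits` and `visit_counts` are assumptions made for illustration.

```r
# Toy data: column names follow the question, values are invented
visits <- data.frame(
  UniSA_Customer_No = c(101, 101, 101, 102, 102, 103),
  Sale_Date = c("2014-01-05", "2014-01-05", "2014-03-10",
                "2014-02-01", "2014-02-01", "2014-07-19"),
  stringsAsFactors = FALSE
)

# Keep one row per customer/date pair, so a repeated same-day row counts once
unique_visits <- unique(visits[, c("UniSA_Customer_No", "Sale_Date")])

# Number of distinct visit days per customer
visit_counts <- table(unique_visits$UniSA_Customer_No)
print(visit_counts)
```

Here customer 101 has three rows but only two distinct dates, so it is counted as two visits.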

As a minimal example of the difference between unique() and table():

> unique(c(1, 2, 3, 1))
[1] 1 2 3
> table(c(1, 2, 3, 1))

1 2 3 
2 1 1 

There is no need for external packages to do this.

Tom Kelly

You can use count() from dplyr:

library(dplyr)
library(lubridate)

df %>% count(UniSA_Customer_No, Sale_Date = year(as.Date(Sale_Date)))

In base R, use table():

table(df$UniSA_Customer_No, format(as.Date(df$Sale_Date), '%Y'))
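Note that the calls above count rows, so a customer/date pair that appears twice is counted twice. If repeated same-day rows should count as one visit, a possible base R variant is to drop duplicate pairs first and then aggregate by customer and year. This is only a sketch on invented data. The names `visits`, `Year`, `visits_per_year` and `n_visits` are hypothetical and not part of the original answer.

```r
# Toy data: column names follow the question, values are invented
df <- data.frame(
  UniSA_Customer_No = c(101, 101, 101, 102, 102),
  Sale_Date = c("2014-01-05", "2014-01-05", "2014-03-10",
                "2014-02-01", "2015-02-01"),
  stringsAsFactors = FALSE
)

# One row per customer/date pair
visits <- unique(df[, c("UniSA_Customer_No", "Sale_Date")])

# Extract the year from the date
visits$Year <- format(as.Date(visits$Sale_Date), "%Y")

# Count distinct visit days per customer and year
visits_per_year <- aggregate(
  visits$Sale_Date,
  by = list(UniSA_Customer_No = visits$UniSA_Customer_No, Year = visits$Year),
  FUN = length
)
names(visits_per_year)[names(visits_per_year) == "x"] <- "n_visits"
print(visits_per_year)
```

Unlike table(), aggregate() returns only the customer/year combinations that actually occur, in a long data frame that is easier to merge with other data.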
Ronak Shah