Count Number of Visits by Unique Date

Question

I am trying to create a column that counts the number of unique visits to a site within each group. However, my current operation counts the same date as both visit_3 and visit_4 because the data contain captures and recaptures. How do I count unique dates per site instead of simply counting the rows in each group?

With my current process, the last observation at "admin_pond" on "2022-05-19" with the capture_type "recapture" should have "visit_3", but it shows "visit_4". I want the number of unique visits per site and date regardless of capture_type. So the last two observations should show "visit_3" because they happened on the same date at the same site.

Data

data <- structure(list(site = c("admin_pond", "admin_pond", "admin_pond", 
"admin_pond"), date = structure(c(19123, 19130, 19131, 19131), class = "Date"), 
    n = c(9L, 15L, 11L, 9L), capture_type = c("new", "new", "new", 
    "recapture")), row.names = c(NA, -4L), class = c("tbl_df", 
"tbl", "data.frame"))

Method

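library(dplyr)

# 1:n() numbers every row within a site, so rows sharing a date
# still get separate visit numbers (this is the behaviour in question)
bull_frog_visits <- data %>% 
  group_by(site) %>% 
  mutate(n_visit = 1:n(),
         n_visit = paste0("visit_", n_visit))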

head(bull_frog_visits)

I don't understand why "visit_3" and not "visit_2" since there are only two rows with the same site and date. — Rui Barradas, May 22 '23 at 19:40
@RuiBarradas I think they want to enumerate unique `date`s within a `site`. — Gregor Thomas, May 22 '23 at 19:41

score 4 · Accepted Answer · answered May 22 '23 at 19:41

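This should work. Within each site, `match(date, unique(date))` returns the position of each row's date among that site's unique dates, so rows that share a date share a visit number: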

data %>% 
  group_by(site) %>% 
  mutate(n_visit = match(date, unique(date)),
         n_visit = paste0("visit_", n_visit))
# # A tibble: 4 × 5
# # Groups:   site [1]
#   site       date           n capture_type n_visit
#   <chr>      <date>     <int> <chr>        <chr>  
# 1 admin_pond 2022-05-11     9 new          visit_1
# 2 admin_pond 2022-05-18    15 new          visit_2
# 3 admin_pond 2022-05-19    11 new          visit_3
# 4 admin_pond 2022-05-19     9 recapture    visit_3
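
To check that the grouped `match()` approach restarts the numbering for every site, here is a minimal sketch on made-up data. The second site name, `tuttle_pond`, and the object names `visits` and `result` are hypothetical and not taken from the question.

```r
library(dplyr)

# Hypothetical two-site data; only "admin_pond" appears in the question
visits <- tibble(
  site = c("admin_pond", "admin_pond", "admin_pond", "tuttle_pond", "tuttle_pond"),
  date = as.Date(c("2022-05-11", "2022-05-19", "2022-05-19",
                   "2022-05-12", "2022-05-12")),
  capture_type = c("new", "new", "recapture", "new", "recapture")
)

result <- visits %>%
  group_by(site) %>%
  # position of each date among the unique dates of its own site
  mutate(n_visit = paste0("visit_", match(date, unique(date)))) %>%
  ungroup()

result$n_visit
```

Note that `match(date, unique(date))` numbers dates in order of first appearance, so it only matches chronological order if the rows are already sorted by date within each site.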

score 2 · Answer 2 · answered May 22 '23 at 19:54

Another option is to use `consecutive_id()` from dplyr >= 1.1.0, which is the equivalent of data.table's `rleid()`:

data %>% 
  mutate(n_visit = paste0("visit_", consecutive_id(date)))

# A tibble: 4 × 5
  site       date           n capture_type n_visit
  <chr>      <date>     <int> <chr>        <chr>  
1 admin_pond 2022-05-11     9 new          visit_1
2 admin_pond 2022-05-18    15 new          visit_2
3 admin_pond 2022-05-19    11 new          visit_3
4 admin_pond 2022-05-19     9 recapture    visit_3

answered May 22 '23 at 19:54 by TarJae

This may work, but in my full data set I have multiple sites. – Eizy, May 22 '23 at 20:03

This is no problem; we could do `data %>% mutate(n_visit = paste0("visit_", consecutive_id(date)), .by = site)`. Note that this needs `dplyr` >= 1.1.0. – TarJae, May 22 '23 at 20:05
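
One caveat worth checking before choosing between the two answers: `consecutive_id()` numbers runs of identical adjacent values, so it only counts unique dates when equal dates sit next to each other. The sketch below compares it with `match()` and with `dense_rank()` on deliberately unsorted, made-up rows. `dense_rank()` is not used in either answer; it is included here only as an assumed alternative that numbers dates chronologically. The names `unsorted` and `compared` are hypothetical.

```r
library(dplyr)

# Hypothetical rows where the same date is NOT adjacent
unsorted <- tibble(
  site = rep("admin_pond", 3),
  date = as.Date(c("2022-05-19", "2022-05-11", "2022-05-19"))
)

compared <- unsorted %>%
  mutate(
    run_id      = consecutive_id(date),        # new id whenever the value changes
    first_seen  = match(date, unique(date)),   # order of first appearance
    by_calendar = dense_rank(date),            # chronological order, no gaps
    .by = site
  )

compared
```

If the data may be unsorted, sorting with `arrange(site, date)` first makes all three columns agree.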
