0

I would like know how many animals will show up on a specific day. This chart describes people register their animals in advance.

For instance, at 7 days ahead, someone registered for their 4 cats to show up on 5/3/2019; at 6 days ahead, another 9 cats are registered for 5/3/2019. So there will be 7+6=13 cats showing up on 5/3/2019.

When days_ahead = 0, it simply means someone registered on the event day. For instance, 4 wolves registered for 5/1/2019 on 5/1/2019 (0 days ahead), and there will be 4 wolves that day.

library(dplyr)
set.seed(0)

animal = c(rep('cat', 5), rep('dog', 6), rep('wolf', 3))
date = sample(seq(as.Date("2019/5/1"), as.Date('2019/5/10'), by='day'), 14, replace=TRUE)
days_ahead = sample(seq(0,14), 14, replace=FALSE)
number = sample.int(10, 14, replace=TRUE)

dt = data.frame(animal, date, days_ahead, number) %>% arrange(animal, date)

The expected outcome should have the same 1-3 columns as the example, but the fourth column should be an accumulated number by each date, accumulating on days_ahead.


I added an expected outcome here. The comments are used to explain the accumulated_number column.

enter image description here

I've considered loop function but not entirely sure how to loop over three variables (cat, date, and days_ahead). Any advice is appreciated!!

Rachel Zhang
  • 562
  • 6
  • 20
  • The question is not very clear, in part because the descriptions in the text don't match the example data. Also it's not clear how date and days ahead are related - should the latter be subtracted from the former to give the date the registration was made? It would help to improve the example data. – neilfws Jul 03 '19 at 12:12
  • Thanks for your comments! I added an expected output. – Rachel Zhang Jul 03 '19 at 12:28
  • 1
    Possible duplicate https://stackoverflow.com/questions/30400024/cumulative-total-by-group – Ronak Shah Jul 03 '19 at 13:29
  • Using base R `ave` you can do `with(dt, ave(number, animal, date, FUN = cumsum))` – Ronak Shah Jul 03 '19 at 13:31

2 Answers2

2

The accumulated_number is somewhat easy with cumsum(). See this link for your comments field:

Cumulatively paste (concatenate) values grouped by another variable

dt%>%
  group_by(animal,date)%>%
  mutate(accumulated_number = cumsum(number)
         ,comments = Reduce(function(x1, x2) paste(x1, x2, sep = '+'), as.character(number), accumulate = T)
         )%>%
  ungroup()

Also, my dataset is slightly different than yours with the same seed. Still, it seems to work.

# A tibble: 14 x 6
   animal date       days_ahead number accumulated_number comments
   <fct>  <date>          <int>  <int>              <int> <chr>   
 1 cat    2019-05-03         10      9                  9 9       
 2 cat    2019-05-04          6      4                  4 4       
 3 cat    2019-05-06          8      5                  5 5       
 4 cat    2019-05-09          5      4                  4 4       
 5 cat    2019-05-10         13      6                  6 6       
 6 dog    2019-05-01          0      2                  2 2       
 7 dog    2019-05-03          3      5                  5 5       
 8 dog    2019-05-07          1      7                  7 7       
 9 dog    2019-05-07          9      8                 15 7+8     
10 dog    2019-05-09         12      2                  2 2       
11 dog    2019-05-10          7      9                  9 9       
12 wolf   2019-05-02         14      5                  5 5       
13 wolf   2019-05-03         11      8                  8 8       
14 wolf   2019-05-07          4      9                  9 9 
Cole
  • 11,130
  • 1
  • 9
  • 24
  • 1
    I don't think OP need `comments` column, it is only for explanation. It is just `cumsum` by group. – Ronak Shah Jul 03 '19 at 13:22
  • 1
    @Ronak I think you're right. The ```comments``` was kind of fun to research, though. I'm keeping it in. – Cole Jul 03 '19 at 13:31
0

I'm not sure I understand your question, is this what you want?

I'm adding an "animals_arriving" column and kepping the rest of dt

library(dplyr)
library(lubridate)
dt %>% 
  mutate(date_arrival = date + days(days_ahead)) %>%
  group_by(date = date_arrival) %>% 
  summarise(animals_arriving = n()) %>% 
  full_join(dt,by="date")
Fino
  • 1,774
  • 11
  • 21
  • Hi Fino, thanks for your answer. I ran your code but I wasn't entirely sure about the results - it seems like row 13 (5/3/2019) has NA animals arrival, but there are cats coming. I would like to know on each `date` how many cats/dog/wolves will be there. – Rachel Zhang Jul 03 '19 at 12:13