I have the arrival time, departure time, and date for different customers of a system. I want to count the number of people in the system every 30 minutes. How can I do this in R? Here are my data:
- It's not helpful to share pictures of data. It's easier to help you if you provide a proper [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output so we can run and test possible solutions. – MrFlick Dec 15 '17 at 19:03
- Please provide a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – M-- Dec 15 '17 at 19:03
2 Answers
If I understand your question, here's an example with fake data:

    library(tidyverse)
    library(lubridate)

    # Fake data
    set.seed(2)
    dat = data.frame(id=1:1000, type=rep(c("A","B"), 500),
                     arrival=as.POSIXct("2013-08-21 05:00:00") + sample(-10000:10000, 1000, replace=TRUE))
    dat$departure = dat$arrival + sample(100:5000, 1000, replace=TRUE)

    # Times when we want to check how many people are still present
    times = seq(round_date(min(dat$arrival), "hour"), ceiling_date(max(dat$departure), "hour"), "30 min")

    # Count number of people present at each time
    map_df(times, function(x) {
      dat %>%
        group_by(type) %>%
        summarise(Time = x,
                  Count=sum(arrival < x & departure > x)) %>%
        spread(type, Count) %>%
        mutate(Total = A + B)
    })

       Time                    A     B Total
       <dttm>              <int> <int> <int>
     1 2013-08-21 02:00:00     0     0     0
     2 2013-08-21 02:30:00    26    31    57
     3 2013-08-21 03:00:00    54    53   107
     4 2013-08-21 03:30:00    75    81   156
     5 2013-08-21 04:00:00    58    63   121
     6 2013-08-21 04:30:00    66    58   124
     7 2013-08-21 05:00:00    55    60   115
     8 2013-08-21 05:30:00    52    63   115
     9 2013-08-21 06:00:00    57    62   119
    10 2013-08-21 06:30:00    62    51   113
    11 2013-08-21 07:00:00    60    67   127
    12 2013-08-21 07:30:00    72    54   126
    13 2013-08-21 08:00:00    66    46   112
    14 2013-08-21 08:30:00    19    12    31
    15 2013-08-21 09:00:00     1     2     3
    16 2013-08-21 09:30:00     0     0     0
    17 2013-08-21 10:00:00     0     0     0

I'm not sure what you mean by counting the number of people "in the system", but I'm assuming you mean "the number of people who have arrived but not yet departed". To do this, you can apply a simple logical condition to the relevant columns of your data frame, e.g.

    logicVec <- df$arrival_time <= dateTimeObj & dateTimeObj < df$departure_time

logicVec will be a logical vector of TRUEs and FALSEs. Because TRUE == 1 and FALSE == 0, you can call sum(logicVec) to get the total number of people/customers/rows that fulfill the condition above.
You can then repeat this line of code for every dateTimeObj (of class e.g. POSIXct) you want. In your case, that would be a sequence of dateTimeObj values spaced 30 minutes apart.
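As a rough sketch of that repetition in base R, the condition can be evaluated once per check time with vapply. The data below are made up purely for illustration, and the names checkTimes, counts and result are hypothetical; only arrival_time and departure_time follow the snippet above.

```r
# Hypothetical example data (values invented for illustration only)
df <- data.frame(
  arrival_time = as.POSIXct(c("2017-12-15 08:05:00", "2017-12-15 08:20:00",
                              "2017-12-15 08:40:00"), tz = "UTC"),
  departure_time = as.POSIXct(c("2017-12-15 08:45:00", "2017-12-15 09:10:00",
                                "2017-12-15 08:55:00"), tz = "UTC")
)

# Check times spaced 30 minutes apart
checkTimes <- seq(as.POSIXct("2017-12-15 08:00:00", tz = "UTC"),
                  as.POSIXct("2017-12-15 09:30:00", tz = "UTC"),
                  by = "30 min")

# For each check time, count rows with arrival <= time < departure
counts <- vapply(seq_along(checkTimes), function(i) {
  tm <- checkTimes[i]
  sum(df$arrival_time <= tm & tm < df$departure_time)
}, integer(1))

result <- data.frame(time = checkTimes, count = counts)
print(result)
```

With these three made-up customers the counts at 08:00, 08:30, 09:00 and 09:30 come out as 0, 2, 1 and 0. Whether the comparisons should be strict or inclusive at the boundaries depends on how you want to treat someone arriving or departing exactly on a check time.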
I hope this helps.
