I am trying to calculate the number of weekdays between two dates in a data frame.

I am using the solution given here. It works when every row has both dates, but if any date is missing, the whole calculation fails and no results are returned.

Here is the code being used:

library(dplyr)

# Function to count the weekdays (Monday to Friday) between two dates
Nweekdays <- Vectorize(function(a, b)
  sum(!weekdays(seq(a, b, "days")) %in% c("Saturday", "Sunday")))

# Sample data frame
id = c("ID1", "ID2", "ID3") 
startDate = c("2019-08-01", "2019-08-06", "2019-08-10") 
endDate = c("2019-08-05", "2019-08-15", "2019-08-20")
df = data.frame(id, startDate, endDate)

# Using dplyr to coerce to Date and apply the function
df <- df %>%
  mutate(startDate = as.Date(startDate)) %>% 
  mutate(endDate = as.Date(endDate)) %>% 
  mutate(workingdays = Nweekdays(startDate, endDate))

The code works correctly and gives me a new column with working days. But if one of the dates is missing or NA, e.g.

startDate = c("2019-08-01", "", "2019-08-10")

then I get

Evaluation error: 'to' must be a finite number.

and there is no new column generated. I want an empty result for the missing value, but the correct result for all others. I am sure I am missing something basic so apologies for that!!

Kaveh1000

1 Answer

You just need to update your function to deal with non-date values so it only tries to compute if both a and b are dates:

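Nweekdays <- Vectorize(function(a, b) {
  # Vectorize passes one pair of dates at a time, so use the scalar operator &&
  if (!is.na(a) && !is.na(b)) {
    sum(!weekdays(seq(a, b, "days")) %in% c("Saturday", "Sunday"))
  } else {
    # Either date is missing: return NA for this row instead of failing
    NA_integer_
  }
})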

You could use a stricter check than !is.na(), such as lubridate::is.Date(), but this is a base R solution, and it is enough here: by the time the function runs, the as.Date() call in the mutate step has already turned the empty string into NA.
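For reference, here is a minimal end-to-end sketch of the same idea in base R, using the sample data from the question with the missing start date. Two things in it are my own assumptions rather than part of the answer above: the dates are parsed with an explicit format = "%Y-%m-%d", so an empty string becomes NA regardless of its position in the vector, and the weekend test uses format(, "%u") (ISO weekday number) instead of weekdays(), since weekdays() returns names in the current locale.

```r
# Count weekdays between two dates; return NA if either date is missing.
Nweekdays <- Vectorize(function(a, b) {
  if (is.na(a) || is.na(b)) return(NA_integer_)
  days <- seq(a, b, by = "days")
  # "%u" is the ISO weekday number: 1 = Monday ... 7 = Sunday
  sum(as.integer(format(days, "%u")) <= 5L)
})

# Sample data from the question, with the second start date missing
df <- data.frame(
  id        = c("ID1", "ID2", "ID3"),
  startDate = as.Date(c("2019-08-01", "", "2019-08-10"), format = "%Y-%m-%d"),
  endDate   = as.Date(c("2019-08-05", "2019-08-15", "2019-08-20"), format = "%Y-%m-%d")
)

df$workingdays <- Nweekdays(df$startDate, df$endDate)
print(df)
# workingdays is 3, NA, 7
```

The row with the missing date gets NA and the other rows are computed as before. The same Nweekdays can be dropped into the mutate() pipeline from the question unchanged.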

Geoffrey Grimm