0

I have a dataframe CLIENTS with: Hotel number Nationality Date of arrive Date of departure Something like:

Client   Nationality   Hotel   Dateofarrive   DateofDeparture
Cl1          es         h1      21/07/2019    24/07/2019
Cl2          es         h1      23/07/2019    24/07/2019
Cl3          es         h1      06/07/2019    10/07/2019
Cl4          es         h2      05/07/2019    06/07/2019
Cl5          fr         h3      01/07/2019    02/07/2019
Cl6          pt         h1      07/07/2019    09/07/2019

And I created a dataframe Hotel and for each hotel of the dataframe i have all the clients that are of spanish nationality and all the nights they have sleept in the hotel, the same for french nationality and portuguese. Something like:

Hotel    CliEspan  Nights  CliFrench Night CliPortug Night 
H1           3       8       0         0          1      2
H2           1       5       1         1          0      0

Of all the nights the clients stay in the hotel I also want the number of nights that has been on a weekend. for example:

Hotel    CliEspan  Nights WdN CliFrench Night WdN CliPortug Night WdN 
H1           3       8      5    0         0   0       1      2    2
H2           1       5      3    1         1   0       0      0    0

For the moment I have this, i need to include the weekends days.

clients[, nights := as.numeric(CL_DATASORTIDA - CL_DATAENTRADA)]

clients$CL_NACIONALITAT<-as.factor(as.character(clients$CL_NACIONALITAT))

clients$CL_NACIONALITAT<-substring(clients$CL_NACIONALITAT, 1,2)
clients$AT_REGISTRECOMERC<-substring(clients$AT_REGISTRECOMERC, 1,6)

# dcast to wide format
suma <- function(x)base::sum(x, na.rm=TRUE)

hotel <- dcast(clients, AT_REGISTRECOMERC ~ CL_NACIONALITAT,  value.var = 'nights', fun.aggregate = list(suma, length))

names(hotel) <- gsub('nights_length', 'clients', names(new))

hotel<- data.frame(new)

inds <- which(colSums(hotel[, 90:177], na.rm=TRUE) < 20)

hotel$nights_other<-rowSums(hotel[, as.numeric(inds) + 1], na.rm=TRUE)

hotel$visitants_other<-rowSums(hotel[, as.numeric(inds) + 89], na.rm=TRUE)

hotel<-hotel[-c(inds+1, inds+89)]

AnnaIC
  • 49
  • 4
  • 1
    See https://stackoverflow.com/questions/26441700/how-to-determine-if-date-is-a-weekend-or-not-not-using-lubridate – Eli Berkow Aug 14 '19 at 07:45

1 Answers1

1

Here's a way to do it.

First create a function that counts the number of weekend nights in a date sequence.

library(dplyr)

n_weekend_nights <- function(arrival, departure) {

    if (arrival == departure) {
      return(0)
      # arrival %>% format("%u") %>% as.numeric() %>% {. %in% 5:7}
    } else {
      seq.Date(arrival, departure-1, "day") %>%
        format("%u") %>%
        as.numeric() %>%
        {. %in% 5:7} %>%
        sum()
    }
}

Then mapply it on your arrival and departure dates.

clients$WdN <- mapply(n_weekend_nights, clients$Dateofarrive, clients$DateofDeparture)

clients

# Client Nationality Hotel Dateofarrive DateofDeparture   WdN
# <chr>  <chr>       <chr> <date>       <date>          <int>
# Cl1    es          h1    2019-07-21   2019-07-24          1
# Cl2    es          h1    2019-07-23   2019-07-24          0
# Cl3    es          h1    2019-07-06   2019-07-10          2
# Cl4    es          h2    2019-07-05   2019-07-06          1
# Cl5    fr          h3    2019-07-01   2019-07-02          0
# Cl6    pt          h1    2019-07-07   2019-07-09          1
Aron Strandberg
  • 3,040
  • 9
  • 15
  • Gives me error: Error in seq.int(0, to0 - from, by) : wrong sign in 'by' argument – AnnaIC Aug 14 '19 at 09:06
  • This would happen if on one or more row, the departure date is not later than the arrival date. Is nights >= 1 for all clients? – Aron Strandberg Aug 14 '19 at 09:38
  • It could nights = 0, because some clients can be in a hotel just for some hours. – AnnaIC Aug 14 '19 at 09:59
  • I've added an if statement that handles these guests. Since they did not sleep any nights at hotel, it makes most sense to return 0 irregardless of when they checked in. – Aron Strandberg Aug 14 '19 at 11:01