I'm new to dplyr, but I've been searching for hours for an answer to this without much success. I am currently trying to write a function that will return a 0 or 1 depending on whether a time falls within a certain range, but only if it matches both the Date and City that is relevant. Here is the newest iteration of code I've come up with:
NEW_DF <- df1 %>%
  full_join(df2, by="City", keep= TRUE) %>%
  mutate(newvariable = case_when(
    df1$City == df2$City & df1$Dates == df2$Dates & df1$Start <= df2$Time <= df1$End ~ 1,
    df1$City == df2$City & df1$Dates == df2$Dates & df1$Start > df2$Time ~ 0,
    df1$City == df2$City & df1$Dates == df2$Dates & df1$End > df2$Time ~ 0,
    TRUE ~ NA_real_)) %>%
  select(City=df2$City, Date=df2$Dates, Time=df2$Time, newvariable) %>%
  semi_join(df2, by="City")
Ideally, this would produce a table in which, for rows with matching City and Dates, newvariable shows whether the Time in df2 falls inside (1) or outside (0) the Start/End range in df1. But I keep getting errors. The error for my newest code is this:
Error in select(City=df2$City, Date=df2$Dates, Time=df2$Time, newvariable), : object 'newvariable' not found
With different code, I got this error, which I'm only including to be thorough:
Error in UseMethod("select") : no applicable method for 'select' applied to an object of class "character"
I thought I might need to create a new vector first for the new variable to populate, but that doesn't seem to work either.
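To make the intended result concrete, here is a minimal sketch with made-up data. The column names follow the code above, but the values are invented, and it assumes Start, End, and Time are zero-padded "HH:MM" strings (the real columns may be a different type). It uses base R's merge() only to illustrate the target output; it is not meant as the dplyr solution.

```r
# Made-up sample data; column names follow the question, values are invented.
df1 <- data.frame(
  City  = c("Austin", "Austin", "Boston"),
  Dates = as.Date(c("2021-01-01", "2021-01-02", "2021-01-01")),
  Start = c("09:00", "10:00", "08:00"),
  End   = c("17:00", "12:00", "09:30"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  City  = c("Austin", "Austin", "Boston", "Denver"),
  Dates = as.Date(c("2021-01-01", "2021-01-02", "2021-01-01", "2021-01-01")),
  Time  = c("10:30", "13:00", "08:00", "11:00"),
  stringsAsFactors = FALSE
)

# Keep every df2 row; rows without a City/Dates match in df1 get NA Start/End.
merged <- merge(df2, df1, by = c("City", "Dates"), all.x = TRUE)

# 1 if Time is inside [Start, End], 0 if outside, NA if there was no match.
# Assumption: zero-padded "HH:MM" strings compare correctly as text.
merged$newvariable <- as.numeric(
  merged$Time >= merged$Start & merged$Time <= merged$End
)

result <- merged[, c("City", "Dates", "Time", "newvariable")]
print(result)
```

With these invented values, the Austin row for 2021-01-01 and the Boston row should get 1 (the range check is inclusive at both ends), the Austin row for 2021-01-02 should get 0, and the Denver row, which has no match in df1, should get NA.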