Note: This question was closed as a 'duplicate'. The solutions offered here and here did not answer my question. They showed how to merge when a single entry fell within a range, I'm trying to identify overlapping ranges and joining them. Perhaps my title could have been better...
I have a main data set main_df
with a start and end time (in seconds). I would like to see if the time range in main_df
falls within a list of ranges in lookup_df
, and if so, grab the value from lookup_df
. Additionally, if the main_df
falls within two different lookup ranges, duplicate the row so each value is represented.***
main_df <- tibble(start = c(30,124,161),
end = c(80,152,185))
lookup_df <- tibble(start = c(34,73,126,141,174,221),
end = c(69,123,136,157,189,267),
value = c('a','b','b','b','b','a'))
# Do something here to get the following:
> final_df
# A tibble: 4 x 4
start end value notes
<dbl> <dbl> <chr> <chr>
1 30 80 a ""
2 30 80 b "Duplicate because it falls within a and b"
3 124 152 b "Falls within two lookups but both are b"
4 161 185 b ""
***Edit: Looking at the way I've structured the problem...
#Not actual code
left_join(main_df, lookup_df, by(some_range_join_function) %>%
add_rows(through_some_means)
Rather than having to add a new row I could flip how I'm joining them...
semi_join(lookup_df, main_df, by(some_range_join_function))