I am relatively new to R and am hoping someone can answer this question for me. I have a data frame with four columns: proc_id, mco_id, start_date, and end_date. (The data frame is shown in the linked image.) This is the logic I am after: for each combination of proc_id and mco_id, if a row's start_date falls exactly one day after the preceding row's end_date, those rows should be collapsed into a single row with the minimum start date and the maximum end date.
For instance, the first three rows contain 1234 and ABC for proc_id and mco_id, respectively. The start date in row 3 is one day after the end date in row 2, and the start date in row 2 is one day after the end date in row 1. So the final data frame for proc_id 1234 and mco_id ABC must have a start date of '2014-01-01' and an end date of '2014-07-01'. If the start date is more than one day after the preceding end date for the same proc_id and mco_id, the rows stay as they are. Finally, if the start date comes before the preceding end date (that is, the date ranges overlap), then, as in the first case, the rows are collapsed to the minimum start date and the maximum end date.
This is the final data frame that I would like to get.
Any help would be greatly appreciated.
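
For reference, the rules described above could be sketched in base R roughly as follows. This is only a minimal sketch under stated assumptions, not a definitive solution: the original data is only available as an image, so the sample rows are hypothetical (only the ids 1234/ABC and the overall span 2014-01-01 to 2014-07-01 come from the question), the helper name merge_spans is made up for illustration, and every row is assumed to have end_date on or after start_date. A start date equal to the preceding end date is treated as an overlap.

```r
# Hypothetical sample data. Only 1234/ABC and its overall span
# (2014-01-01 to 2014-07-01) come from the question; the rest is made up.
df <- data.frame(
  proc_id    = c(1234, 1234, 1234, 5678, 5678, 9012, 9012),
  mco_id     = c("ABC", "ABC", "ABC", "XYZ", "XYZ", "DEF", "DEF"),
  start_date = as.Date(c("2014-01-01", "2014-03-01", "2014-05-01",
                         "2014-01-01", "2014-03-01",
                         "2014-01-01", "2014-03-01")),
  end_date   = as.Date(c("2014-02-28", "2014-04-30", "2014-07-01",
                         "2014-01-31", "2014-06-30",
                         "2014-04-15", "2014-08-31")),
  stringsAsFactors = FALSE
)

# Collapse adjacent or overlapping date ranges within ONE proc_id/mco_id group.
# (merge_spans is a hypothetical helper name; assumes end_date >= start_date.)
merge_spans <- function(d) {
  d <- d[order(d$start_date, d$end_date), ]

  # Latest end date seen so far, so a long early range still covers later rows
  run_end <- as.Date(cummax(as.numeric(d$end_date)), origin = "1970-01-01")

  # Running end date as of the previous row (NA for the first row)
  prev_end <- c(as.Date(NA), head(run_end, -1))

  # A new span starts when the gap to the previous end is more than one day
  new_span <- is.na(prev_end) | d$start_date > prev_end + 1

  # The last row of each span is the row just before the next span starts
  is_last <- c(new_span[-1], TRUE)

  data.frame(
    proc_id    = d$proc_id[1],
    mco_id     = d$mco_id[1],
    start_date = d$start_date[new_span],  # minimum start (rows are sorted)
    end_date   = run_end[is_last],        # maximum end within the span
    stringsAsFactors = FALSE
  )
}

# Apply the helper to every proc_id/mco_id combination and stack the results
groups <- split(df, list(df$proc_id, df$mco_id), drop = TRUE)
result <- do.call(rbind, lapply(groups, merge_spans))
result <- result[order(result$proc_id, result$mco_id, result$start_date), ]
rownames(result) <- NULL

print(result)
```

With the sample rows above, the three consecutive 1234/ABC rows collapse into one, the two 5678/XYZ rows stay separate because of the gap, and the two overlapping 9012/DEF rows collapse into one. The same grouping idea (flag a new span with a lagged end date, then take a cumulative sum as a span id) should carry over to dplyr or data.table if those are preferred.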