Building on "Check if a date is within an interval in R", we want to check whether a specific event falls into a timeframe defined by another event. A concrete example: for each country, did an event (battle/protests/...) happen during the election period?

country <- c("Angola","Angola","Angola","Angola","Angola", "Benin","Benin","Benin","Benin","Benin","Benin")
event_type <- c("battle", "protests","riots", "riots", "elections","elections","protests","riots","violence","riots","elections")
event_date <- as.Date(c("2017-06-16", "2017-01-23", "2016-03-15", "2017-09-18", "2017-08-23", "2019-04-18", "2019-03-12", "2019-04-14", "2018-03-15", "2015-09-14", "2016-03-20"))
start_ecycle <- as.Date(c(NA,NA,NA,NA,"2017-05-25", "2019-01-18",NA,NA,NA,NA,"2015-12-21"))
end_ecycle <- as.Date(c(NA,NA,NA,NA,"2017-09-22","2019-05-18",NA,NA,NA,NA,"2016-04-19"))

mydata <- data.frame(country, event_type, event_date, start_ecycle, end_ecycle)

To this end, we created an interval variable:

library(lubridate)
is.instant(mydata$start_ecycle); is.instant(mydata$end_ecycle)
mydata$ecycle <- interval(mydata$start_ecycle, mydata$end_ecycle)

Now we are stuck. This is what the data.frame should look like in the end, i.e. a column G "ecycle_within" is added and set to 1 if event_date falls within ecycle (per country):

[image: expected output table]

Any help much appreciated. Thanks!

TiF
  • Why are Angola riots within the ecycle? There is no start and end date, so how can it be within? – vanao veneri Aug 06 '19 at 16:47
  • Thanks - this is exactly our problem. From the ecycle column (which only has an entry for event_type = elections), we want to infer for every country-event_date whether it falls within this cycle. This is why the new column ‘ecycle_within’ is coded 1 for Angola June 16 but not for January 23. Note that some countries have more than one election, so there is more than one possible ecycle to compare a country-event_date with. – TiF Aug 06 '19 at 19:14

1 Answer

Based on your comment that the election cycles are spread across rows, I would recommend first creating a separate dataset containing only the election data.

You can then join the election dates table back onto the events by country. Note that this creates one row for every combination of event and election date range within a country.

lubridate's %within% operator can then be used to check whether an event date falls within a specific election date range.
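As a small illustration of that check, here is a minimal sketch using two Angola event dates and the Angola election cycle from the question. The names cycle, events, and inside are made up for this example.

```r
library(lubridate)

# Angola election cycle, dates taken from the question's data
cycle <- interval(as.Date("2017-05-25"), as.Date("2017-09-22"))

# Two Angola events from the question: the battle and the protests
events <- as.Date(c("2017-06-16", "2017-01-23"))

# %within% is vectorised over the dates and returns a logical vector
inside <- events %within% cycle

# Convert TRUE/FALSE to the 1/0 coding asked for in the question
ecycle_within <- as.integer(inside)
ecycle_within
```

The battle on 2017-06-16 lies inside the cycle and the protests on 2017-01-23 do not, which matches the coding described in the question's comment.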

Lastly, I reduce the number of rows by filtering out the rows for election date ranges that aren't relevant, so that each event appears only once.
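One possible way to do that reduction is to collapse the joined rows per event with any(). This is sketched here under the assumption that only the 0/1 flag is needed and the matching cycle dates can be dropped. The toy tables events and elections and the column hit are hypothetical names for this example, and this is not the code used in the full pipeline further down.

```r
library(dplyr)
library(lubridate)

# Toy data: a subset of the question's Benin rows
events <- tibble(
  country    = "Benin",
  event_type = c("protests", "violence", "elections"),
  event_date = as.Date(c("2019-03-12", "2018-03-15", "2016-03-20"))
)
elections <- tibble(
  country      = "Benin",
  start_ecycle = as.Date(c("2019-01-18", "2015-12-21")),
  end_ecycle   = as.Date(c("2019-05-18", "2016-04-19"))
)

flagged <- events %>%
  # one row per event x election cycle of the same country
  # (newer dplyr versions may warn about a many-to-many join here)
  inner_join(elections, by = "country") %>%
  # interval() and %within% are both vectorised
  mutate(hit = event_date %within% interval(start_ecycle, end_ecycle)) %>%
  # an event is flagged if it falls within any cycle of its country
  group_by(country, event_type, event_date) %>%
  summarise(ecycle_within = as.integer(any(hit)), .groups = "drop")

flagged
```

Because an inner join is used, events from a country without any election rows would be dropped. A left join followed by explicit NA handling would be needed to keep them.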

I am more familiar with dplyr and purrr, so I used them in the implementation below, but you should be able to do something similar with base R functions too.
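For reference, a base R version could look roughly like the following. This is only a sketch on a subset of the question's Benin rows, rebuilt here so the block runs on its own. The helper vector in_any_cycle is a hypothetical name, and plain date comparisons stand in for lubridate's interval.

```r
# Subset of the question's data (Benin only)
mydata <- data.frame(
  country      = "Benin",
  event_type   = c("elections", "protests", "violence", "elections"),
  event_date   = as.Date(c("2019-04-18", "2019-03-12", "2018-03-15", "2016-03-20")),
  start_ecycle = as.Date(c("2019-01-18", NA, NA, "2015-12-21")),
  end_ecycle   = as.Date(c("2019-05-18", NA, NA, "2016-04-19"))
)

# Separate table holding only the election cycles
elections <- mydata[mydata$event_type == "elections",
                    c("country", "start_ecycle", "end_ecycle")]

# For each row, test whether the event date lies in any cycle of the same country
in_any_cycle <- vapply(seq_len(nrow(mydata)), function(i) {
  same_country <- as.character(elections$country) == as.character(mydata$country[i])
  cyc <- elections[same_country, ]
  any(mydata$event_date[i] >= cyc$start_ecycle &
        mydata$event_date[i] <= cyc$end_ecycle)
}, logical(1))

mydata$ecycle_within <- as.integer(in_any_cycle)
mydata[, c("event_type", "event_date", "ecycle_within")]
```

Looping over row indices rather than over the date vector itself keeps the Date class intact in each comparison.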

The output is close to what you asked for, though I'm not 100% sure why you would want to do it this way.


library(tidyverse)  # attaches dplyr and purrr
library(lubridate)

# One row per election cycle, keyed by country
elections <- mydata %>% 
  as_tibble() %>% 
  filter(event_type == "elections") %>% 
  mutate(election_year = year(start_ecycle)) %>% 
  select(country, start_ecycle, end_ecycle, election_year)

mydata2 <- mydata %>% 
  as_tibble() %>% 
  mutate(row = row_number()) %>%   # remember the original row order
  select(row, country, event_type, event_date) %>% 
  # one row per event x election cycle of the same country
  left_join(elections, by = "country") %>% 
  mutate(ecycle = map2(start_ecycle, end_ecycle, ~ interval(.x, .y))) %>% 
  # 1 if the event date lies inside the cycle, 0 otherwise
  mutate(ecycle_within = map2_int(event_date, ecycle, ~ .x %within% .y)) %>% 
  select(-ecycle) %>% 
  # keep one row per event, preferring a cycle that contains it
  group_by(country, event_type, event_date) %>% 
  arrange(desc(ecycle_within)) %>% 
  slice(1) %>% 
  ungroup() %>% 
  arrange(row) %>%                 # restore the original row order
  select(-row)

mydata2 %>% select(-election_year)

#> # A tibble: 11 x 6
#>    country event_type event_date start_ecycle end_ecycle ecycle_within
#>    <fct>   <fct>      <date>     <date>       <date>             <int>
#>  1 Angola  battle     2017-06-16 2017-05-25   2017-09-22             1
#>  2 Angola  protests   2017-01-23 2017-05-25   2017-09-22             0
#>  3 Angola  riots      2016-03-15 2017-05-25   2017-09-22             0
#>  4 Angola  riots      2017-09-18 2017-05-25   2017-09-22             1
#>  5 Angola  elections  2017-08-23 2017-05-25   2017-09-22             1
#>  6 Benin   elections  2019-04-18 2019-01-18   2019-05-18             1
#>  7 Benin   protests   2019-03-12 2019-01-18   2019-05-18             1
#>  8 Benin   riots      2019-04-14 2019-01-18   2019-05-18             1
#>  9 Benin   violence   2018-03-15 2019-01-18   2019-05-18             0
#> 10 Benin   riots      2015-09-14 2019-01-18   2019-05-18             0
#> 11 Benin   elections  2016-03-20 2015-12-21   2016-04-19             1

Overlytic