Hi, I am new to R and would like to know if there is a simple way to filter data over multiple date ranges. I have a data set with dates from 07.03.2003 to 31.12.2016. I need to split/filter the data by multiple time periods, as shown below.

Dates required in the new data frame: 07/03/2003 to 06/03/2005 and 01/01/2013 to 31/12/2016

i.e. the new data frame should not include dates from 07/03/2005 to 31/12/2012.

Jaap
Allan H
    Welcome to StackOverflow! Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610). This will make it much easier for others to help you. – Jaap Jun 12 '17 at 19:00

2 Answers

Let's take the following data.frame with dates:

library(dplyr)
library(lubridate)

df <- data.frame(date = c(ymd("2017-02-02"), ymd("2016-02-02"), ymd("2014-02-01"), ymd("2012-01-01")))

        date
1 2017-02-02
2 2016-02-02
3 2014-02-01
4 2012-01-01

I can filter this for a range of dates using lubridate::ymd, dplyr::filter and dplyr::between:

df1 <- filter(df, between(date, ymd("2017-01-01"), ymd("2017-03-01")))

        date
1 2017-02-02

Or:

df2 <- filter(df, between(date, ymd("2013-01-01"), ymd("2014-04-01")))

        date
1 2014-02-01
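
To get both ranges from the question into one data frame, the two between() conditions can be combined with `|`. This is only a minimal sketch: the asker's data was not posted, so the data frame `df` and its `date` column below are hypothetical, and the dates are parsed with lubridate::dmy because the question writes them day first.

```r
library(dplyr)
library(lubridate)

# Hypothetical sample data; the real data frame was not posted
df <- data.frame(date = dmy(c("07.03.2003", "06.03.2005", "07.03.2005",
                              "31.12.2012", "01.01.2013", "31.12.2016")))

# Keep a row if its date falls in either wanted range
# (between() includes both end points)
df_keep <- filter(df,
                  between(date, dmy("07.03.2003"), dmy("06.03.2005")) |
                  between(date, dmy("01.01.2013"), dmy("31.12.2016")))
```

Rows dated 07/03/2005 and 31/12/2012 are dropped because they match neither condition.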
CPak

I would go with lubridate. In particular:

library(data.table)   
library(lubridate)

set.seed(555)  # for reproducibility
N <- 1000      # number of pseudo-random values to generate

date1 <- dmy("07-03-2003")
date2 <- dmy("06-03-2005")
date3 <- dmy("01-01-2013")
date4 <- dmy("31-12-2016")

Create a data table with two columns (dates and numbers):

my_dt <- data.table(date_sample = sample(seq(date1, date4, by = "day"), N), numeric_sample = sample(N, replace = FALSE))

> head(my_dt)
     date_sample   numeric_sample
1:  2007-04-11              2
2:  2006-04-20             71
3:  2007-12-20             46
4:  2016-05-23             78
5:  2011-10-07              5
6:  2003-09-10             47

Let's impose some cuts:

forbidden_dates <- interval(date2 + 1, date3 - 1)  # interval that dates should not fall in

> forbidden_dates
[1] 2005-03-07 UTC--2012-12-31 UTC

test_date1 <- dmy("08-03-2003")  # should not fall in the above range
test_date2 <- dmy("08-03-2005")  # should fall in the above range

Therefore:

test_date1 %within% forbidden_dates
[1] FALSE
test_date2 %within% forbidden_dates
[1] TRUE

A good way of visualizing the cut:

before

> plot(my_dt)


my_dt <- my_dt[!(date_sample %within% forbidden_dates)]  # apply the temporal cut

after

> plot(my_dt)

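
The same cut can also be written the other way round, by naming the two ranges to keep instead of the one to exclude. A rough sketch, assuming the ranges from the question; `keep_1`, `keep_2` and the test vector `dates` are names made up for this illustration:

```r
library(lubridate)

# The two wanted ranges from the question, as lubridate intervals
keep_1 <- interval(dmy("07-03-2003"), dmy("06-03-2005"))
keep_2 <- interval(dmy("01-01-2013"), dmy("31-12-2016"))

# Hypothetical test dates: inside the first range, inside the gap,
# and on the first day of the second range
dates <- dmy(c("08-03-2003", "08-03-2005", "01-01-2013"))

# TRUE where a date lies in either interval (end points included)
in_range <- dates %within% keep_1 | dates %within% keep_2
kept <- dates[in_range]
```

With the data table above, the same condition should work as the row filter, i.e. my_dt[date_sample %within% keep_1 | date_sample %within% keep_2].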

amonk