The attached image shows the data frame I am working with.

[Image: data frame]

The first column of the data frame is Date. I need to subset the data frame on a condition that covers multiple months and years. For example, I want all rows for July and September in the years 2005 and 2006.

I tried the following code:

output <- subset(df, format.Date(Date, "%m")==c("07", "09") & format.Date(Date, "%Y")==c("2005","2006"))

The above code produces unexpected output.

I found posts about this problem, but they only covered a single month and year.

Martin Gal
raghav
  • Please don't post data as images. Take a look at how to make a [great reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for ways of showing data. – Martin Gal Jun 07 '20 at 16:42
  • Use %in% rather than ==. Also, you should not, in general, call methods directly. Use the generic instead, so format.Date should be format. – G. Grothendieck Jun 07 '20 at 17:17
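
To illustrate the point about %in% versus ==, here is a minimal sketch. The dates are made up for the demonstration and are not the asker's data. With ==, the shorter vector is recycled and compared position by position, so the result depends on row order; %in% tests each element for membership in the set.

```r
# Hypothetical sample dates (an assumption, not the data from the question)
dates <- as.Date(c("2005-07-15", "2005-09-15", "2006-07-15", "2006-09-15"))
m <- format(dates, "%m")  # month of each date as a two-digit string

# == recycles c("09", "07") along m and compares element by element,
# so a row matches only if its month happens to line up with its position
eq_result <- m == c("09", "07")

# %in% asks, for every element of m, whether it occurs anywhere in the set
in_result <- m %in% c("09", "07")

print(eq_result)
print(in_result)
```

Every date here falls in July or September, yet the == comparison matches none of them because the positions never line up, while %in% matches all four.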

1 Answer

If you don't mind installing the tidyverse packages, you can use this simple filter:

library(tidyverse)
library(lubridate)  # installed along with tidyverse, no separate install needed

# filter July and September data in 2005 and 2006
output <- df %>%
    filter(year(Date) %in% c(2005, 2006) & month(Date) %in% c(7, 9))

If you prefer base R, this should work as well:

output <- subset(df, format(Date, "%m") %in% c("07", "09") & format(Date, "%Y") %in% c("2005", "2006"))

This assumes that the df$Date column is of class "Date".
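
If the column is stored as text instead, it would need converting first. The following is a rough sketch under assumptions: the data frame df_chr and its contents are hypothetical, and the "%Y-%m-%d" format string is a guess at how the dates are written, so adjust it to match the real data.

```r
# Hypothetical data frame with dates stored as character (not the asker's data)
df_chr <- data.frame(
  Date  = c("2005-07-01", "2005-08-01", "2006-09-01", "2007-07-01"),
  value = 1:4,
  stringsAsFactors = FALSE
)
class(df_chr$Date)  # "character"

# Convert to Date; the format string is an assumption about the text layout
df_chr$Date <- as.Date(df_chr$Date, format = "%Y-%m-%d")

# Same base R filter as above, now applied to a genuine Date column
output_chr <- subset(
  df_chr,
  format(Date, "%m") %in% c("07", "09") &
    format(Date, "%Y") %in% c("2005", "2006")
)
print(output_chr)
```

Only the July 2005 and September 2006 rows remain; the August 2005 and July 2007 rows each fail one of the two conditions.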

  • Thanks @raghav for your feedback, I'm glad it works! I have added a base R solution, which is however more difficult to read, so for basic computations I certainly recommend using tidyverse & lubridate. –  Jun 19 '20 at 19:41