I have time series data and I am trying to subset the following:

1) periods between specific years (beginning 12 AM January 1 and ending 11 PM December 31)
2) periods without specific months

These are two independent subsets I am trying to do.

Given the following dataframe:

test <- data.frame(seq(from = as.POSIXct("1983-03-09 01:00"), to = as.POSIXct("1985-01-08 00:00"), by = "hour"))
colnames(test) <- "DateTime"
test$Value <- sample(0:100, nrow(test), replace = TRUE)

I can first create Year and Month columns and use these to subset:

# Add year column
test$Year <- as.numeric(format(test$DateTime, "%Y"))

# Add month column
test$Month <- as.numeric(format(test$DateTime, "%m"))

# Subset specific year (1984 in this case)
sub1 = subset(test, Year!="1983" & Year!="1985")

# Subset specific months (April and May in this case)
sub2 = subset(test, Month=="4" | Month=="5")

However, I am wondering if there is a better way to do this directly from the POSIXct datetimes (without having to first create the Year and Month columns). Any ideas?

Thomas
  • you could use `months` from base but there is no built-in for year AFAIK. `lubridate` package has both `year` and `month`, and then subset like `test[month(test$DateTime) %in% c(4, 5), ]` – rawr Apr 11 '14 at 02:01
  • Seems kind of silly to convert to numeric and then compare to character, don't you think? – IRTFM Apr 11 '14 at 04:47

1 Answer

# Keep only the rows whose year is neither 1983 nor 1985 (i.e. 1984)
sub1 <- subset(test, !format(DateTime, "%Y") %in% c("1983", "1985"))
# Keep only April and May
sub2 <- subset(test, as.numeric(format(DateTime, "%m")) %in% 4:5)
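
As a rough sketch of two more base R options that work directly on the POSIXct column: a half-open datetime range for the year, and the `mon` field of `as.POSIXlt()` for the months. The `tz = "UTC"` argument and the names `start`, `end`, `in_1984`, `april_may` and `no_april_may` are assumptions made for illustration; they do not appear in the question.

```r
# Rebuild the example data from the question. The time zone is fixed to UTC
# here (an assumption) so results do not depend on local daylight-saving rules.
test <- data.frame(DateTime = seq(from = as.POSIXct("1983-03-09 01:00", tz = "UTC"),
                                  to   = as.POSIXct("1985-01-08 00:00", tz = "UTC"),
                                  by   = "hour"))
test$Value <- sample(0:100, nrow(test), replace = TRUE)

# 1) All of 1984: from midnight on January 1, 1984 up to (but not including)
#    midnight on January 1, 1985, so the 11 PM December 31 reading is kept.
start <- as.POSIXct("1984-01-01 00:00", tz = "UTC")
end   <- as.POSIXct("1985-01-01 00:00", tz = "UTC")
in_1984 <- test[test$DateTime >= start & test$DateTime < end, ]

# 2) Months via POSIXlt: the mon field is zero-based (0 = January), so add 1.
mon <- as.POSIXlt(test$DateTime)$mon + 1
april_may    <- test[mon %in% 4:5, ]     # keep only April and May
no_april_may <- test[!mon %in% 4:5, ]    # drop April and May instead
```

The range comparison avoids formatting every timestamp as text, and negating the `%in%` test gives the "without specific months" variant from the question.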
IRTFM