I am using R, and I need to set up a loop (I think) that extracts the month from each date and assigns a season. I would like to assign winter to months 12, 1, 2; spring to 3, 4, 5; summer to 6, 7, 8; and fall to 9, 10, 11. I have a subset of the data below. I am awful with loops and couldn't figure it out. I also wasn't sure how packages like lubridate would handle the dates.

"","UT_TDS_ID_2011.Monitoring.Location.ID","UT_TDS_ID_2011.Activity.Start.Date","UT_TDS_ID_2011.Value","UT_TDS_ID_2011.Season"
"1",4930585,"7/28/2010 0:00",196,""
"2",4933115,"4/21/2011 0:00",402,""
"3",4933115,"7/23/2010 0:00",506,""
"4",4933115,"6/14/2011 0:00",204,""
"8",4933115,"12/3/2010 0:00",556,""
"9",4933157,"11/18/2010 0:00",318,""
"10",4933157,"11/6/2010 0:00",328,""
"11",4933157,"7/23/2010 0:00",290,""
"12",4933157,"6/14/2011 0:00",250,""
Roland
MKWalsh
  • There is confusion with the word "season". Maybe modify the title to replace "season" with "quarter" to avoid confusion with astronomical seasons. – agenis Sep 18 '20 at 08:06

2 Answers

Regarding the subject/title of the question, it's actually possible to do this without extracting the month. The first two solutions below do not extract the month. There is also a third solution which does extract the month, but only to increment it.

1) as.yearqtr/as.yearmon. Convert the dates to year/month and add one month (1/12). After that shift the calendar quarters correspond to the seasons, so convert to year/quarter, yq, and label the quarters as shown:

library(zoo)
yq <- as.yearqtr(as.yearmon(DF$dates, "%m/%d/%Y") + 1/12)
DF$Season <- factor(format(yq, "%q"), levels = 1:4, 
                labels = c("winter", "spring", "summer", "fall"))

giving:

       dates Season
1  7/28/2010 summer
2  4/21/2011 spring
3  7/23/2010 summer
4  6/14/2011 summer
5  12/3/2010 winter
6 11/18/2010   fall
7  11/6/2010   fall
8  7/23/2010 summer
9  6/14/2011 summer
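
To see why the one-month shift works, here is a small sketch that keeps each intermediate value for three of the sample dates. It assumes the zoo package is installed; the variable names ym and shifted are only illustrative and are not part of the solution above.

```r
library(zoo)

# Three of the sample dates: one each from July, December and November
dates <- c("7/28/2010", "12/3/2010", "11/18/2010")

ym      <- as.yearmon(dates, "%m/%d/%Y")  # year/month of each date
shifted <- ym + 1/12                      # one month later, so December becomes January
yq      <- as.yearqtr(shifted)            # calendar quarter of the shifted month

# Quarter number as a character string: 1 = winter, 2 = spring, 3 = summer, 4 = fall
qnum <- format(yq, "%q")
data.frame(dates, shifted = format(shifted), quarter = qnum)
```

December lands in quarter 1 of the following year, which is what puts it in winter together with January and February.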

1a) A variation of this is to use chron's quarters which produces a factor so that levels=1:4 does not have to be specified. To use chron replace the last line in (1) with:

library(chron)
DF$Season <- factor(quarters(as.chron(yq)), 
                labels = c("winter", "spring", "summer", "fall"))

chron could also be used in conjunction with the remaining solutions.

2) cut. This solution uses only base R. First convert the dates to the first of the month using cut, then add 32 days to get a date in the next month, d. The quarters corresponding to d are the seasons, so compute them using quarters and construct the labels in the same fashion as in the first solution:

d <- as.Date(cut(as.Date(DF$dates, "%m/%d/%Y"), "month")) + 32
DF$Season <- factor(quarters(d), levels = c("Q1", "Q2", "Q3", "Q4"), 
   labels = c("winter", "spring", "summer", "fall"))

giving the same answer.
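
The snap to the first of the month matters: adding 32 days to a date late in the month can skip a whole month. The sketch below shows this with base R only. The date 11/30/2010 is a made-up value for illustration and is not in the question's data.

```r
# Two dates from the sample data plus one hypothetical end-of-month date
dates <- as.Date(c("7/28/2010", "12/3/2010", "11/30/2010"), "%m/%d/%Y")

first_of_month <- as.Date(cut(dates, "month"))  # 2010-07-01, 2010-12-01, 2010-11-01
d <- first_of_month + 32                        # always falls in the following month

# Quarter of the shifted date: Q1 = winter, Q2 = spring, Q3 = summer, Q4 = fall
shifted_q <- quarters(d)

# Without the snap, 30 November + 32 days is 1 January, which would
# wrongly put a November date into winter
naive_q <- quarters(dates[3] + 32)

data.frame(dates, first_of_month, d, shifted_q)
```

Since no month has more than 31 days, the first of the month plus 32 days can never overshoot the following month.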

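3) POSIXlt. This solution also uses only base R. Set the day to the first of the month and move the month forward by one, wrapping December around to January, then take the quarters as before. Note that the components must be assigned by their full names, mday and mon, because assignment with $ does not partially match names:

p <- as.POSIXlt(as.Date(DF$dates, "%m/%d/%Y"))
p$mday <- 1                  # first of the month
p$mon <- (p$mon + 1) %% 12   # next month (mon is 0-based); only the quarter is used, so the year is left alone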
DF$Season <- factor(quarters(p), levels = c("Q1", "Q2", "Q3", "Q4"), 
               labels = c("winter", "spring", "summer", "fall"))

Note 1: We could optionally omit levels= in all these solutions if we knew that every season appears.
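
As a quick illustration of Note 1, the following base R sketch uses shifted dates that fall in only two quarters. With levels= the labels line up; without it, factor() sees only two distinct values and rejects the four labels. The dates are illustrative values chosen for this sketch.

```r
# Shifted dates that fall in Q3 and Q4 only, so two seasons never appear
d <- as.Date(c("2010-08-02", "2010-12-03"))
q <- quarters(d)

# With explicit levels, each quarter maps to the intended label
with_levels <- factor(q, levels = c("Q1", "Q2", "Q3", "Q4"),
                      labels = c("winter", "spring", "summer", "fall"))
as.character(with_levels)

# Without levels, factor() derives two levels from the data and four labels
# no longer fit, so the call fails; capture the error message instead of stopping
without_levels <- tryCatch(
  factor(q, labels = c("winter", "spring", "summer", "fall")),
  error = function(e) conditionMessage(e))
without_levels
```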

Note 2: We used this data frame:

DF <- data.frame(dates = c('7/28/2010', '4/21/2011', '7/23/2010', 
 '6/14/2011', '12/3/2010', '11/18/2010', '11/6/2010', '7/23/2010', 
 '6/14/2011'))
G. Grothendieck

Using only base R, you can convert the "datetime" column to "Date" class (as.Date(..)), extract the "month" (format(..., '%m')) and change the character value to numeric (as.numeric()). Create an "indx" vector that has the season names as its values, repeated three times each, and set its names to the matching month numbers in the order 12, 1, ..., 11 (setNames(..)). Then index it by month to get the corresponding "Season" for the "months" vector.

 months <- as.numeric(format(as.Date(df$datetime, '%m/%d/%Y'), '%m'))
 indx <- setNames( rep(c('winter', 'spring', 'summer',
                   'fall'),each=3), c(12,1:11))

 df$Season <- unname(indx[as.character(months)])
 df
 #        datetime Season
 #1  7/28/2010 0:00 summer
 #2  4/21/2011 0:00 spring
 #3  7/23/2010 0:00 summer
 #4  6/14/2011 0:00 summer
 #5  12/3/2010 0:00 winter
 #6 11/18/2010 0:00   fall
 #7  11/6/2010 0:00   fall
 #8  7/23/2010 0:00 summer
 #9  6/14/2011 0:00 summer

Or, as @Roland mentioned in the comments, you can use strptime to convert the "datetime" to "POSIXlt" and extract the month ($mon), which is zero-based, hence the + 1:

 months <- strptime(df$datetime, format='%m/%d/%Y %H:%M')$mon +1

and use the same method as above
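
If a lookup vector feels heavy, the same mapping can be sketched with integer arithmetic instead. This is an alternative to the named-vector approach, not a restatement of it, and the helper name season_of and its fmt argument are hypothetical names made up for this example.

```r
# Hypothetical helper: map date strings to seasons without a lookup table
season_of <- function(x, fmt = "%m/%d/%Y") {
  m <- as.numeric(format(as.Date(x, fmt), "%m"))  # month number, 1 to 12
  # m %% 12 turns December into 0, so 0-2 = winter, 3-5 = spring,
  # 6-8 = summer, 9-11 = fall; integer division by 3 picks the group
  c("winter", "spring", "summer", "fall")[(m %% 12) %/% 3 + 1]
}

res <- season_of(c("7/28/2010 0:00", "4/21/2011 0:00",
                   "12/3/2010 0:00", "11/18/2010 0:00"))
res
```

The result is a plain character vector, so it can be assigned to df$Season directly or wrapped in factor() if an ordering of the seasons is needed.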

data

  df <- data.frame(datetime = c('7/28/2010 0:00', '4/21/2011 0:00', 
 '7/23/2010 0:00', '6/14/2011 0:00', '12/3/2010 0:00', '11/18/2010 0:00',
  '11/6/2010 0:00', '7/23/2010 0:00', '6/14/2011 0:00'),stringsAsFactors=FALSE)
akrun