I am working with a panel data set: production data for 20 sites over 10 years. I want to estimate the effect of different rainfall (RF) patterns on monthly production.
My data is stored in Google and looks like this:
I want to get the effect of the seasonal rainfall pattern on monthly production. My rainfall seasons are as follows:
- December (of the previous year) to February (of the current year) is the NE monsoon (NEM)
- March to April is the 1st inter-monsoon (IM1)
- May to September is the SW monsoon (SWM)
- October to November is the 2nd inter-monsoon (IM2)
I need the total rainfall for each of these four seasons, year by year, over the 10 years from 2000 to 2010, for each of the cross-sectional sites (n = 20). I don't have RF data for December 1999; in that case we could assume that December 1999 RF is the same as January 2000 RF (other suggestions would be appreciated).
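The season definitions above can be sketched as a simple lookup. This is only an illustration: it assumes `Mon` is stored as a number from 1 to 12, and the helper names `season_of` and `season_year` are hypothetical, not part of my existing code. December is shifted into the following year so that it is counted with the January and February of the same NEM season.

```r
# Hypothetical helper: map a month number (1-12) to its season label.
season_of <- function(mon) {
  lookup <- c("NEM", "NEM", "IM1", "IM1", "SWM", "SWM",
              "SWM", "SWM", "SWM", "IM2", "IM2", "NEM")
  lookup[mon]
}

# Hypothetical helper: December belongs to the NEM of the following year,
# so its "season year" is the calendar year plus one.
season_year <- function(year, mon) {
  ifelse(mon == 12, year + 1, year)
}

season_of(c(12, 1, 3, 9, 11))
season_year(c(1999, 2000, 2000), c(12, 1, 12))
```

With this convention, December 1999 would fall into the 2000 NEM total, and December 2010 would fall into a 2011 season that lies outside the data.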
So far I have coded this:
dat <- read.csv("my_data.csv")
# get rainfall (RF) and other data
RF <- dat$RF
Y <- dat$Year
Mon <- dat$Mon
Site <- dat$Site
# Specify new data frame with 4 seasons of RF over the years across different sites
Year <- vector(mode = "numeric", length = ((Y[length(dat$Y)] - Y[1]) + 1) * (length(levels(Site))))
Site <- vector(mode = "numeric", length = ((Y[length(dat$Y)] - Y[1]) + 1) * (length(levels(Site))))
Season1 <- vector(mode = "numeric", length = ((Y[length(dat$Y)] - Y[1]) + 1) * (length(levels(Site))))
Season2 <- vector(mode = "numeric", length = ((Y[length(dat$Y)] - Y[1]) + 1) * (length(levels(Site))))
Season3 <- vector(mode = "numeric", length = ((Y[length(dat$Y)] - Y[1]) + 1) * (length(levels(Site))))
Season4 <- vector(mode = "numeric", length = ((Y[length(dat$Y)] - Y[1]) + 1) * (length(levels(Site))))
Year <- rep(seq(from = Y[1], to = Y[length(Y)]), length(levels(Site)))
number_of_Y <- Y[length(Y)] - Y[1] + 1
# Site_index <- 2
for (Site_index in 1:length(levels(Site))) {
  start_row <- 1 + (Site_index - 1) * number_of_Y
  end_row <- (Site_index - 1) * number_of_Y + number_of_Y
  Site[start_row:end_row] <- rep(levels(Site)[Site_index], (Y[length(Y)] - Y[1] + 1))
}
But it doesn't work. I don't understand why "Site" does not get its levels from the code above, or how to get the yearly total of each RF pattern across sites as a new data frame.
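Two things in the code above may explain the missing levels. `Site` is overwritten with a plain numeric vector, so any later call to `levels(Site)` returns `NULL`; and in R 4.0 or later `read.csv` returns character columns rather than factors unless `stringsAsFactors = TRUE` is passed, so `levels()` would be `NULL` from the start. A minimal sketch of an alternative that avoids pre-allocating vectors is shown below. It uses made-up toy data with the column names `Site`, `Year`, `Mon` and `RF`, assumes `Mon` is numeric (1 to 12), and copies January of the first year into the missing preceding December, as suggested in the question.

```r
# Toy data standing in for my_data.csv (RF values are made up for illustration).
set.seed(1)
dat <- expand.grid(Mon = 1:12, Year = 2000:2001, Site = c("A", "B"),
                   stringsAsFactors = FALSE)
dat$RF <- round(runif(nrow(dat), 0, 300))

# Assumption: the missing December before the first year equals January
# of the first year at each site.
dec_prev <- dat[dat$Year == min(dat$Year) & dat$Mon == 1, ]
dec_prev$Year <- min(dat$Year) - 1
dec_prev$Mon <- 12
dat2 <- rbind(dec_prev, dat)

# Label each month with its season; December counts towards the next year's NEM.
season_lookup <- c("NEM", "NEM", "IM1", "IM1", "SWM", "SWM",
                   "SWM", "SWM", "SWM", "IM2", "IM2", "NEM")
dat2$Season <- factor(season_lookup[dat2$Mon],
                      levels = c("NEM", "IM1", "SWM", "IM2"))
dat2$SeasonYear <- ifelse(dat2$Mon == 12, dat2$Year + 1, dat2$Year)

# Drop the last December, whose season year lies outside the data.
dat2 <- dat2[dat2$SeasonYear <= max(dat$Year), ]

# Total RF per site, season year and season (long format).
long <- aggregate(RF ~ Site + SeasonYear + Season, data = dat2, FUN = sum)

# One row per site and year, one column per season (RF.NEM, RF.IM1, ...).
wide <- reshape(long, idvar = c("Site", "SeasonYear"),
                timevar = "Season", direction = "wide")
wide
```

The `wide` data frame has one row per site and year, which could then be merged back onto the monthly production data by `Site` and year.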