I'm trying to fit a linear model as part of a time series analysis, such as the following:
Model 1 = novice_crash ~ time + grad + time.after + month
I have the following code that creates the variables in question above:
grad<- c(replicate(66,0),replicate(30,1))
grad<- ts(grad, start=c(2002,1), frequency=12)
time<- seq(1,96, by=1)
time<- ts(time,start=c(2002,1), frequency = 12)
time.after<- c(replicate(66,0),replicate(30,1))
time.after<- ts(time.after, start=c(2002,1), frequency = 12)
#month<- seasonaldummy(novice_crashes)
month<-time
grad.lag1<- lag(grad)
time.after.lag1<- lag(time.after)
'novice_crashes' is a ts object that comes from the following code (where 'crash' is a data frame read from a CSV file):
novice<- crash$novice_crash
total<- crash$total_crash
novice_crashes<-ts(novice, start = c(2002,12), end=c(2009,12), frequency = 12)
When I try to run this:
model1<- lm(novice_crashes ~ time + grad + time.after + month)
I get the following error:
Error in model.frame.default(formula = novice_crashes ~ time + grad + : variable lengths differ (found for 'time')
I have checked the lengths of time, grad, time.after and month (which are all 96 units long).
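The length of the response, novice_crashes, is not in the list of lengths checked above. The following is a minimal sketch of how it could be compared with the regressors, assuming only the start, end and frequency given in this question. The name response_placeholder and its NA values are hypothetical stand-ins for the real crash$novice_crash data, so only the time index is meaningful here.

```r
# Regressor as defined in the question: 96 monthly points starting January 2002.
time <- ts(seq(1, 96, by = 1), start = c(2002, 1), frequency = 12)

# Hypothetical placeholder for the response: NA values stand in for the real
# data; only start, end and frequency are taken from the question.
response_placeholder <- ts(NA, start = c(2002, 12), end = c(2009, 12), frequency = 12)

# Compare the number of observations in each series.
n_obs <- c(time = length(time), response = length(response_placeholder))
print(n_obs)

# Compare the time spans covered by each series.
print(start(time)); print(end(time))
print(start(response_placeholder)); print(end(response_placeholder))
```

Running the same length(), start() and end() calls on the real novice_crashes object would show whether it covers the same months as time, grad, time.after and month.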
The dataset crash had NAs present, but I removed them with:
crash<- na.omit(crash)
I'm much more used to Python, so I may be missing something here.