I have a data set on evacuation that is essentially:
Start | End | Evac_num | date_time |
---|---|---|---|
loc_1 | loc_2 | 2000 | 30-09-2020 16:00 |
Here Start is the starting location ID, End is the ID of the location they evacuate to, Evac_num is the number of evacuees, and date_time is the date and time at which this was recorded. Start and End combinations are repeated for various date/times.
I ran some OLS regressions with
r1 <- lm(y ~ x, data = df)
as well as fixed effects models with
library(lfe)  # felm() comes from the lfe package
fe1 <- felm(y ~ x | date_time, data = df)
and found my data was heteroskedastic after running a Breusch-Pagan test. I then decided to fit some Generalised Least Squares (GLS) models to account for this, which works well for the OLS models, but I do not know how to add date_time fixed effects.
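For context, the heteroskedasticity check can be reproduced by hand in base R. This is only a minimal sketch on simulated data (the data, `fit`, `aux`, `bp_stat` and `bp_pval` are made-up names, not my real data), and it computes Koenker's studentized version of the Breusch-Pagan statistic, which may differ from the variant a given package reports:

```r
# Simulated data (hypothetical): the error spread grows with x
set.seed(1)
n <- 500
x <- runif(n, 1, 10)
y <- 2 + 0.5 * x + rnorm(n, sd = 0.5 * x)

fit <- lm(y ~ x)

# Auxiliary regression of the squared OLS residuals on the regressors
aux <- lm(resid(fit)^2 ~ x)

# Studentized (Koenker) statistic: n * R^2 of the auxiliary regression,
# compared with a chi-squared distribution with 1 df (one regressor)
bp_stat <- n * summary(aux)$r.squared
bp_pval <- pchisq(bp_stat, df = 1, lower.tail = FALSE)
c(statistic = bp_stat, p.value = bp_pval)
```

A small p-value here points to residual variance that depends on x, which is the situation I am in.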
For the GLS models I did:
df$resi <- r1$residuals
varfunc.ols1 <- lm(log(resi^2) ~ x, data = df)
df$varfunc <- exp(varfunc.ols1$fitted.values)
# lm() minimises sum(w * e^2), so the weights should be the inverse of the estimated variance
r1.gls <- lm(y ~ x, weights = 1/varfunc, data = df)
summary(r1.gls)
summary(varfunc.ols1)
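One way the same feasible GLS steps might be extended to date_time fixed effects is to enter date_time as a factor (dummy variables) in both the first-stage and the weighted regression. The sketch below uses simulated data; `df_sim`, `pair`, `ols_fe`, `varfunc_fit` and `fgls_fe` are hypothetical names, and it assumes the variance depends only on x. It passes weights = 1/varfunc because lm() treats weights as inverse variances. I have not confirmed this is the right approach for my data, and with many distinct date_time values the dummies may become expensive:

```r
# Simulated panel (hypothetical): 30 location pairs observed at 20 time points
set.seed(42)
n_time <- 20
n_pair <- 30
df_sim <- expand.grid(pair = seq_len(n_pair), date_time = seq_len(n_time))
df_sim$x <- runif(nrow(df_sim), 1, 10)
time_effect <- rnorm(n_time)  # one shock per time point
df_sim$y <- 1 + 0.5 * df_sim$x + time_effect[df_sim$date_time] +
  rnorm(nrow(df_sim), sd = 0.3 * df_sim$x)  # heteroskedastic errors

# Step 1: OLS with date_time dummies to obtain residuals
ols_fe <- lm(y ~ x + factor(date_time), data = df_sim)

# Step 2: model the log squared residuals to estimate the variance function
df_sim$resi <- resid(ols_fe)
varfunc_fit <- lm(log(resi^2) ~ x, data = df_sim)
df_sim$varfunc <- exp(fitted(varfunc_fit))

# Step 3: weighted least squares with inverse-variance weights,
# keeping the same date_time dummies
fgls_fe <- lm(y ~ x + factor(date_time), weights = 1 / varfunc, data = df_sim)
coef(summary(fgls_fe))["x", ]
```

The slope on x from `fgls_fe` can then be compared with the one from `ols_fe` to see how much the weighting changes the estimate and its standard error.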
I'm not sure of the best way to run a GLS model with fixed effects in R. I looked into the pggls command in the plm package with something like:
fgls_1 <- pggls(y~x, data=df, model="within", effect="time", index=c("Start", "date_time"))
I was getting this error from the above model:
Warning: duplicate couples (id-time) in resulting pdata.frame
 to find out which, use, e.g., table(index(your_pdataframe), useNA = "ifany")
Error in pdim.default(index[[1L]], index[[2L]]) : duplicate couples (id-time)
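To see which id-time couples are repeated, a check along these lines can be run in base R before building the panel. The toy data frame `df_toy` and the `n_rows` column are made up for illustration and only mirror the layout of my data:

```r
# Hypothetical toy data with the same columns as the real data set
df_toy <- data.frame(
  Start     = c("loc_1", "loc_1", "loc_2", "loc_1"),
  End       = c("loc_2", "loc_3", "loc_3", "loc_2"),
  date_time = c("30-09-2020 16:00", "30-09-2020 16:00",
                "30-09-2020 16:00", "30-09-2020 17:00"),
  Evac_num  = c(2000, 150, 300, 1800)
)

# Count the rows for each (Start, date_time) couple
pair_counts <- aggregate(Evac_num ~ Start + date_time, data = df_toy, FUN = length)
names(pair_counts)[3] <- "n_rows"

# Couples that appear more than once cannot serve as a panel index
pair_counts[pair_counts$n_rows > 1, ]

# The same check with End included in the key
any(duplicated(df_toy[c("Start", "End", "date_time")]))
```

In this toy example a Start location sends evacuees to two different End locations at the same date_time, so Start alone does not identify a row within a time period.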
To deal with this, I combined the Start and End IDs into a single column (location_id), which is basically start.end (e.g. if Start was 123 and End was 234, it is now 123.234), as I thought the repetition of the Start ID was causing the duplicate error:
fgls_1 <- pggls(y~x, data=df, model="within", effect="time", index=c("location_id", "date_time"))
but now I am getting the error that "duplicated row-names are not allowed".
Does anyone have any idea how to handle this? Would it be better if I gave date and time separate columns? Or am I thinking about adding fixed effects to GLS all wrong?