
I am using R to run a hedonic price analysis that aims to estimate the impact of a local shock to a county using a difference-in-difference approach.

My dependent variable is the price of a house, and I would like to account for unobserved neighborhood factors as a part of my parameters to be estimated. I believe that I should be including fixed effects at the zip code level and have the data to do so.

My current model specification is below:

reg_1 = lm(logAppr18 ~ Dist_CBD_Miles + SqFeet + Bedrooms + BathsTotal + YearBuilt.f + year_dum + Bio + year_dum:Bio, data = reg_1)

How would I incorporate fixed effects at the zip code level into this model in R?

Martin
    The problem with this approach is that you are "spending" quite a few degrees of freedom for a categorical variable that you know has some spatial dependence. I think this is less a coding question and more a "needs a statistician" question. – IRTFM Jul 20 '18 at 01:02

1 Answer


1) Providing a reproducible example is your best bet for getting a quick and accurate response; see "How to make a great R reproducible example?"

2) If you are asking how to code dummy variables for use in a fixed-effects model, you can use model.matrix. Below is an example of how it may be used.

dat <- data.frame("value" = c(4,3,2,3), "zip"=c("20002","20021","20021","20202"))

dummy <- model.matrix(~zip,dat)

newdat <- cbind(dat,dummy)

> newdat
  value   zip (Intercept) zip20021 zip20202
1     4 20002           1        0        0
2     3 20021           1        1        0
3     2 20021           1        1        0
4     3 20202           1        0        1

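You can then use the dummy columns (zip20021 and zip20202) as regressors in place of the original "zip" column. Leave out the "(Intercept)" column as well, since lm() adds its own intercept; the omitted zip code (20002) becomes the reference level.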
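As a rough sketch of how the dummies feed into a regression, the toy example below fits the same model two ways: once with explicit dummy columns from model.matrix, and once by wrapping the zip column in factor() so that lm() builds the dummies itself. The data values and the "sqft" column are made up for illustration, and the column name "zip" is an assumption about how the zip code is stored in your data.

```r
# Toy data: hypothetical values, not taken from the question
dat <- data.frame(
  value = c(4, 3, 2, 3, 5, 4),
  sqft  = c(10, 8, 7, 9, 12, 11),
  zip   = c("20002", "20021", "20021", "20202", "20002", "20202")
)

# Option A: explicit dummy columns; drop the "(Intercept)" column
# because lm() adds its own intercept
dummy  <- model.matrix(~ zip, dat)[, -1, drop = FALSE]
newdat <- cbind(dat, dummy)
fit_dummy <- lm(value ~ sqft + zip20021 + zip20202, data = newdat)

# Option B: let lm() create the dummies from a factor
fit_factor <- lm(value ~ sqft + factor(zip), data = dat)

# Both fits estimate the same slope on sqft
same_slope <- all.equal(unname(coef(fit_dummy)["sqft"]),
                        unname(coef(fit_factor)["sqft"]))
print(same_slope)
```

In your own specification this would probably amount to adding + factor(zip) to the right-hand side of the lm() formula, with "zip" replaced by whatever your zip code column is actually called. With many zip codes the coefficient table gets long; packages such as fixest or lfe are designed to absorb fixed effects instead of estimating each dummy explicitly, which may be worth a look.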

Peter_Evan