I understand from this question here that the coefficients are the same whether we use an lm regression with as.factor() or a plm regression with fixed effects.
N <- 10000
df <- data.frame(a = rnorm(N), b = rnorm(N),
region = rep(1:100, each = 100), year = rep(1:100, 100))
df$y <- 2 * df$a - 1.5 * df$b + rnorm(N)
model.a <- lm(y ~ a + b + factor(year) + factor(region), data = df)
summary(model.a)
# (Intercept) -0.0522691 0.1422052 -0.368 0.7132
# a 1.9982165 0.0101501 196.866 <2e-16 ***
# b -1.4787359 0.0101666 -145.450 <2e-16 ***
library(plm)
pdf <- pdata.frame(df, index = c("region", "year"))
model.b <- plm(y ~ a + b, data = pdf, model = "within", effect = "twoways")
summary(model.b)
# Coefficients :
# Estimate Std. Error t-value Pr(>|t|)
# a 1.998217 0.010150 196.87 < 2.2e-16 ***
# b -1.478736 0.010167 -145.45 < 2.2e-16 ***
library(lfe)
model.c <- felm(y ~ a + b | factor(region) + factor(year), data = df)
summary(model.c)
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# a 1.99822 0.01015 196.9 <2e-16 ***
# b -1.47874 0.01017 -145.4 <2e-16 ***
However, the reported R-squared values differ significantly between the models. Which one is correct, and how does the interpretation change between the two models? In my case, the R-squared is much larger for the plm specification and is even negative for the lm + factor one.
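To make the comparison concrete, here is a minimal base-R sketch of where the two R-squared values might come from. It assumes, as far as I understand it, that the within estimator reports the R-squared of the regression on demeaned data, while lm with dummies reports the share of the total variance of y explained, dummies included. The helper `demean2`, the `set.seed()` call, and the object names are my own additions for illustration, and the simple two-way demeaning formula is only valid for a balanced panel like this one.

```r
set.seed(1)  # assumption: seed added only to make the sketch reproducible
N <- 10000
df <- data.frame(a = rnorm(N), b = rnorm(N),
                 region = rep(1:100, each = 100), year = rep(1:100, 100))
df$y <- 2 * df$a - 1.5 * df$b + rnorm(N)

# Hypothetical helper: two-way within transformation (balanced panels only):
# subtract the region mean and the year mean, then add back the overall mean.
demean2 <- function(x, g1, g2) x - ave(x, g1) - ave(x, g2) + mean(x)

dm <- data.frame(y = demean2(df$y, df$region, df$year),
                 a = demean2(df$a, df$region, df$year),
                 b = demean2(df$b, df$region, df$year))

# Dummy-variable regression: R-squared is relative to the total variance of y
fit.dummies <- lm(y ~ a + b + factor(year) + factor(region), data = df)

# Regression on demeaned data: R-squared is relative to the within variance only
fit.within <- lm(y ~ a + b - 1, data = dm)

coef.dummies <- coef(fit.dummies)[c("a", "b")]
coef.within  <- coef(fit.within)

r2.dummies <- summary(fit.dummies)$r.squared
r2.within  <- summary(fit.within)$r.squared

print(rbind(dummies = coef.dummies, within = coef.within))
print(c(dummies = r2.dummies, within = r2.within))
```

Both fits leave the same residuals, so the slopes agree and only the denominator of the R-squared changes: total variation of y in one case, variation left after removing region and year means in the other. In this simulated data y has no region or year effects, so the two values should be close; with strong fixed effects the gap would be expected to widen.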