library(haven)  # provides read_dta()
library(plm)    # provides plm()
PSID <- read_dta("PSID.dta")
new_id <- PSID[PSID$id >= 5000, ]
I am trying to estimate a linear regression model in RStudio, using both OLS and the fixed effects (within) estimator, on a panel data set with the following variables:
var_list = c("id","year","txhw","totaldonation")
and the regression model I want to estimate is:
log(totaldonation_it) = β0 + β1 log(txhw_it) + α_i + u_it
where α_i represents the unobserved heterogeneity of household i.
My code is below, but it produces errors:
reg_ols <- lm(log(totaldonation) ~ log(txhw), data=new_id)
The reg_ols call above fails with: "Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : NA/NaN/Inf in 'x'"
reg_fe <- plm(log(totaldonation) ~ log(txhw), data = new_id, index = c("id", "year"), model = "within")
The reg_fe call above fails with: "Error in model.matrix.pdata.frame(data, rhs = 1, model = model, effect = effect, : model matrix or response contains non-finite values (NA/NaN/Inf/-Inf)"
There are no NA values in my data set. What can I do to resolve these errors?
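One possible explanation, which I have not confirmed on the PSID extract itself, is that some rows have txhw or totaldonation equal to zero, so the log is -Inf rather than NA. A minimal sketch of that check on a made-up data frame (the `toy` object and its values are hypothetical, not taken from PSID.dta):

```r
# Hypothetical toy panel standing in for new_id; values are invented
toy <- data.frame(
  id            = c(5001, 5001, 5002, 5002),
  year          = c(2001, 2003, 2001, 2003),
  txhw          = c(40000, 0, 52000, 61000),
  totaldonation = c(250, 100, 0, 300)
)

# complete.cases() only detects NA, so every row counts as complete here
n_complete <- sum(complete.cases(toy))

# log(0) is -Inf: not NA, but non-finite, which lm() and plm() reject
n_nonfinite_x <- sum(!is.finite(log(toy$txhw)))
n_nonfinite_y <- sum(!is.finite(log(toy$totaldonation)))

print(c(complete = n_complete, bad_x = n_nonfinite_x, bad_y = n_nonfinite_y))
```

Running the same two `is.finite()` counts on new_id should show whether zeros (or negative values, which give NaN) are present in either variable.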
I've tried keeping only complete cases, as below, but I'm not sure it is the right approach:
new_id <- new_id[complete.cases(new_id),]
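Since complete.cases() only removes rows containing NA, it would not help if the problem is zeros in the logged variables. A rough sketch of an alternative filter, assuming it is acceptable to drop households with zero or negative values (that is a modeling choice and changes the sample), again on a hypothetical `toy` data frame rather than the real data:

```r
# Hypothetical toy panel; values are invented for illustration
toy <- data.frame(
  id            = rep(5001:5003, each = 3),
  year          = rep(c(2001, 2003, 2005), times = 3),
  txhw          = c(40000, 0, 45000, 52000, 61000, 58000, 30000, 32000, 35000),
  totaldonation = c(250, 100, 300, 0, 300, 280, 120, 150, 160)
)

# Keep only rows where both logs are defined and finite;
# subset() also drops rows where the condition evaluates to NA
toy_pos <- subset(toy, txhw > 0 & totaldonation > 0)

# The OLS fit now runs because no -Inf or NaN enters the model matrix
reg_ols_toy <- lm(log(totaldonation) ~ log(txhw), data = toy_pos)

print(nrow(toy_pos))
print(coef(reg_ols_toy))
```

The same filtered data frame could then be passed to plm() for the within estimator. Whether dropping these rows is appropriate, as opposed to some other treatment of zero donations, is something I am unsure about.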