library(haven)  # provides read_dta()
library(plm)    # provides plm()

PSID <- read_dta("PSID.dta")

# keep only households with id >= 5000
new_id <- PSID[PSID$id >= 5000, ]

I am trying to fit a linear regression model in RStudio, using both OLS and a fixed-effects estimator, on a panel data set whose variables are:

var_list = c("id","year","txhw","totaldonation")

and the model I want to estimate is:

log(totaldonation_it) = β0 + β1 log(txhw_it) + α_i + u_it

α represents the unobserved individual heterogeneity of individual households.

My code is below, but it gives errors:

reg_ols <- lm(log(totaldonation) ~ log(txhw), data=new_id)

The reg_ols call above gives the error "Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : NA/NaN/Inf in 'x'"

reg_fe <- plm(log(totaldonation) ~ log(txhw), data=new_id, method="within")

The reg_fe call above gives the error "Error in model.matrix.pdata.frame(data, rhs = 1, model = model, effect = effect, : model matrix or response contains non-finite values (NA/NaN/Inf/-Inf)"

There are no NA values in my data set. What can I do to resolve these problems?

I've tried keeping only complete cases, as below, but I'm not sure it is the right approach.

new_id <- new_id[complete.cases(new_id),]
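A minimal sketch of why that filter may not change anything, assuming the problem is zeros rather than missing values. The `toy` data frame below is invented for illustration and is not the PSID file. `complete.cases()` only flags `NA`, while a zero is a valid value until it is logged.

```r
# Toy stand-in for new_id; all values are invented for illustration only
toy <- data.frame(
  id = c(1, 1, 2, 2),
  year = c(2001, 2003, 2001, 2003),
  txhw = c(1200, 0, 800, 950),
  totaldonation = c(50, 20, 0, 75)
)

# No NA anywhere, so complete.cases() keeps every row
all_complete <- all(complete.cases(toy))

# Zeros become -Inf once logged; count the non-finite results per variable
n_bad_txhw <- sum(!is.finite(log(toy$txhw)))
n_bad_don  <- sum(!is.finite(log(toy$totaldonation)))
```

Running the same two `sum(!is.finite(log(...)))` checks on `new_id$txhw` and `new_id$totaldonation` would show whether either column is the source of the non-finite values.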
J Lee
    It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Do you have any 0 amounts in your data? Or negative values? – MrFlick Dec 17 '20 at 02:32
    You can only take the log of positive numbers: `log(0)` is `-Inf`. My guess is your `txhw` variable contains 0s, which become `-Inf` when you log them. For a similar effect, you could use `log1p()`, which is defined as `log(x + 1)`. – Gregor Thomas Dec 17 '20 at 02:35
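The two routes suggested in the comments can be sketched as follows, on an invented toy panel (the data and the object names `toy`, `pos`, `fit_pos`, `fit_l1p`, and `fit_fe` are assumptions for illustration). Which route is appropriate depends on what a zero means in the data; this is not a recommendation for the PSID sample.

```r
# Toy panel; all values are invented for illustration only
toy <- data.frame(
  id = rep(1:3, each = 3),
  year = rep(c(2001, 2003, 2005), times = 3),
  txhw = c(1200, 0, 1500, 800, 950, 700, 0, 400, 600),
  totaldonation = c(50, 20, 80, 0, 75, 40, 10, 30, 60)
)

# Option 1: keep only rows where both variables are strictly positive
pos <- toy[toy$txhw > 0 & toy$totaldonation > 0, ]
fit_pos <- lm(log(totaldonation) ~ log(txhw), data = pos)

# Option 2: log1p(x) is log(1 + x), so zeros map to 0 instead of -Inf
fit_l1p <- lm(log1p(totaldonation) ~ log1p(txhw), data = toy)

# Fixed effects, run only if the plm package is installed
if (requireNamespace("plm", quietly = TRUE)) {
  fit_fe <- plm::plm(log1p(totaldonation) ~ log1p(txhw), data = toy,
                     index = c("id", "year"), model = "within")
}
```

Note that `plm()` selects the estimator through its `model` argument (with `index` naming the individual and time columns), and that `log1p()` still returns `NaN` for values below -1, so negative amounts would need separate handling. With `log1p()` the slope is also no longer exactly the elasticity implied by the original log-log specification.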
