I'm trying to run an OLS regression, but I keep getting an error message saying that a certain variable cannot be found. I am new to R.

All the code works except for the last line.

load("psub.Rdata")

VarsForOLS.tbl <- psub %>%
  mutate(personalIncome = PINCP, groupingID = ORIGRANDGROUP, age = AGEP, sex = SEX, workingclass = COW, educationalLevel = SCHL) %>%
  select(personalIncome, groupingID, age, sex, workingclass, educationalLevel)

trainingIncome.data <- subset(VarsForOLS.tbl, groupingID >=500)
testingIncome.data <- subset(VarsForOLS.tbl, groupingID < 500)

y <- "log(personalIncome, base=10)"
explanatoryVariables <- c("age", "sex", "workingclass", "educationLevel")

olsModel <- paste(y, paste(explanatoryVariables, collapse = "+"), sep = "-")

trainingIncome.ols <- lm(olsModel, data = trainingIncome.data)

I expect to be able to run the linear regression but the error says:

Error in eval(parse(text = x, keep.source = FALSE)[[1L]]) : 
  object 'personalIncome' not found

1 Answer

For the best help you should post a reproducible example.

You are building your formula with a "-" as the separator, which should be a "~". Even better, @benbolker suggested this handy function:

olsModel <- reformulate(explanatoryVariables, response = y)

which will automatically parse the character vector and add the y variable as response, so you don't have to worry about tildes and paste and so on.
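Here is a minimal sketch of the reformulate() approach. The data frame `toy` and its values are made up and only stand in for trainingIncome.data, and workingclass is left out to keep it short. The response is passed as a quoted call instead of a string, so the example does not rely on how a given R version parses a character response.

```r
# Hypothetical stand-in for trainingIncome.data; the values are invented.
toy <- data.frame(
  personalIncome   = c(20000, 35000, 50000, 42000, 61000, 28000, 75000, 39000),
  age              = c(25, 32, 41, 38, 50, 27, 55, 35),
  sex              = factor(c("M", "F", "F", "M", "M", "F", "M", "F")),
  educationalLevel = c(12, 14, 16, 14, 18, 12, 18, 16)
)

explanatoryVariables <- c("age", "sex", "educationalLevel")

# Build a real formula object: response on the left, terms joined with "+".
olsFormula <- reformulate(
  explanatoryVariables,
  response = quote(log(personalIncome, base = 10))
)
print(olsFormula)

# lm() receives a formula, so the columns are looked up in `data`.
fit <- lm(olsFormula, data = toy)
print(coef(fit))
```

Note that the names in explanatoryVariables have to match the column names exactly. The question's vector has "educationLevel", while the column created in mutate() is educationalLevel, so that would likely be the next error after the tilde is fixed.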

Generally, if you're stuck on these kinds of things I'd recommend trying the model without all the parameterisation (just type it out!) and seeing if that runs first. Also, try print(olsModel) to see what you've ended up pasting together.
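As a rough illustration of that debugging step, the sketch below reproduces the string from the question and checks it before it ever reaches lm(). The names `broken` and `fixed` are introduced here for illustration only, and the last explanatory variable is spelled educationalLevel to match the column created in mutate().

```r
y <- "log(personalIncome, base=10)"
explanatoryVariables <- c("age", "sex", "workingclass", "educationalLevel")

# The question's approach: sep = "-" produces an arithmetic expression,
# not a formula, because there is no tilde anywhere in the string.
broken <- paste(y, paste(explanatoryVariables, collapse = "+"), sep = "-")
print(broken)
print(grepl("~", broken, fixed = TRUE))

# Same idea with "~" as the separator, converted explicitly to a formula.
fixed <- as.formula(
  paste(y, paste(explanatoryVariables, collapse = " + "), sep = " ~ ")
)
print(fixed)
print(class(fixed))
```

Without a tilde, the string is parsed and evaluated as ordinary R code when lm() tries to turn it into a formula. That appears to be why the error mentions eval(parse(...)) and complains that personalIncome does not exist: it is being looked up as a plain variable instead of as a column of the data.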