
I have been trying to do a simple regression in R using the following syntax:

Syntax

Unfortunately, R keeps giving me warnings and the summary is not possible:

Warnings

I can't figure out what the problem is. The data includes more than just the 11 predictors mentioned in the syntax.

Thank you! Melanie

duckmayr
  • Hey, Melixy13! That is not actually a problem or an error; these are just warning messages. Also, the next time you ask a question on Stack Overflow, please read this article first: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – rg4s Aug 14 '20 at 14:05
  • The message says that the response is a factor; a Poisson or negative binomial regression (`glm`) may be better suited to the problem. Note that the second message says much the same thing: the residuals are computed with a subtraction, `resp - pred`. – Rui Barradas Aug 14 '20 at 14:08
  • Try `class(PTBS_phq_imputed$phq_sum_last)`. If the answer is "factor", there's your problem, as @Rui points out. What to do about it depends on what sort of variable it really is. If you believe you can treat it like a number because it is a "sum" and has the properties of a number, then `PTBS_phq_imputed$phq_sum_last <- as.numeric(as.character(PTBS_phq_imputed$phq_sum_last))` should solve it. If it is truly a factor, then follow @Rui's advice and choose a different model like `glm`. – Chuck P Aug 14 '20 at 14:18
  • For a better explanation, make a [reproducible example](https://stackoverflow.com/a/5963610/11570343) of your data using `dput`. – cdcarrion Aug 14 '20 at 15:51

1 Answer


This answer is based partly on the comments on the original question.


That is not an error; it is a warning message, which is a different thing. It is generated because you are using lm() with a factor-type response variable. Operations like + and - do not work on factors, hence the message "-" not meaningful for factors.
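As a rough illustration of where the message comes from, the sketch below applies `-` directly to a factor. The data frame `df` and its columns `score` and `x` are made up for this example and are not the asker's data.

```r
# Hypothetical toy data: a numeric score accidentally stored as a factor
df <- data.frame(
  score = factor(c("3", "7", "12", "5", "9", "4")),
  x     = c(1.2, 2.5, 4.1, 1.9, 3.3, 1.5)
)

class(df$score)   # "factor"

# Arithmetic on a factor does not stop with an error; it returns NA values
# and raises the "not meaningful for factors" warning (suppressed here so
# the result can be inspected quietly)
res <- suppressWarnings(df$score - 1)
all(is.na(res))   # TRUE
```

Because the result is all NA, anything computed from such a subtraction is unusable, even though no error was raised.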

If the response is truly a categorical variable, lm() might not be the right way to go to model it. Alternatives in this situation:

  • glm(): Binary logistic regression, Poisson regression, negative binomial regression
  • MASS::polr(): Ordinal logistic regression
  • nnet::multinom(): Multinomial logistic regression
  • and many others.

Please research the corresponding method before actually using it.
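A minimal sketch of two of the glm() options listed above, using invented toy data (the names `df`, `outcome`, `count` and `x` are assumptions for illustration). Which family is appropriate depends entirely on what the response actually measures.

```r
# Hypothetical toy data, for illustration only
df <- data.frame(
  x       = 1:8,
  outcome = factor(c("no", "no", "yes", "no", "yes", "no", "yes", "yes")),
  count   = c(2, 3, 6, 7, 8, 9, 10, 12)
)

# Binary logistic regression: a two-level factor response is accepted
fit_logit <- glm(outcome ~ x, family = binomial, data = df)

# Poisson regression: the response must be numeric counts, not a factor
fit_pois <- glm(count ~ x, family = poisson, data = df)

summary(fit_logit)
summary(fit_pois)
```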

If the response is actually NOT a categorical variable, you will want to investigate why it is coded as a factor and convert it to numeric first.
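The conversion suggested in the comments can be sketched as follows. The data frame here is hypothetical; `score` stands in for the real response column. Note that calling as.numeric() directly on a factor returns the internal level codes, not the values shown in the labels, so the conversion goes through as.character() first.

```r
# Hypothetical data frame; `score` stands in for the real response column
df <- data.frame(
  score = factor(c("3", "7", "12", "5", "9", "4")),
  x     = c(1.2, 2.5, 4.1, 1.9, 3.3, 1.5)
)

# Pitfall: as.numeric() on a factor returns the internal level codes
as.numeric(df$score)

# Convert via character to recover the values the labels represent
df$score <- as.numeric(as.character(df$score))
df$score          # 3 7 12 5 9 4

# With a numeric response, lm() fits without the factor warnings
fit <- lm(score ~ x, data = df)
summary(fit)
```

If the conversion produces NA values with a coercion warning, some labels are not valid numbers (for example stray text or a missing-value code), which is often the reason the column was read as a factor in the first place.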

Nuclear03020704