I'm attempting to run a linear regression in R, but I get the following warnings:

Warning messages:
1: In model.response(mf, "numeric") :
 using type = "numeric" with a factor response will be ignored
2: In Ops.factor(y, z$residuals) : ‘-’ not meaningful for factors

The code is:

reg_ex1 <- lm(V45~TotalScore,data = Combineddatainprogresscsv)

Both variables, V45 and TotalScore, are numerical. A Google search turned up a similar question where it was suggested that the csv file might contain commas. I'm not an expert, though, so I don't know how to check this.

Thank you!

There are 1300 lines, so here is just the final part of the output. Let me know if you need more.

    "50", "60", "70", "80", "90", "Compared to others who may have taken this test, how well do you think you scored? - 1"
), class = "factor"), V46 = structure(c(23L, 6L, 4L, 22L, 
4L, 8L), .Label = c("", "0", "1", "10", "11", "12", "13", 
"14", "15", "16", "17", "18", "19", "2", "20", "3", "4", 
"5", "6", "7", "8", "9", "Score"), class = "factor"), TotalScore = c(0L, 
12L, 10L, 9L, 10L, 14L)), row.names = c(NA, 6L), class = "data.frame")
  • It would help to see your data in an unambiguous format. Can you paste the output from `dput(head(Combineddatainprogresscsv))`? – r2evans Feb 05 '19 at 22:22
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. How did you check that V45 and TotalScore were numeric? Did you verify their `class()`? The error message suggests they may just *look* numeric. – MrFlick Feb 05 '19 at 22:22
  • How did you read your data? `read.csv`? If yes try `read.csv2`. – Douglas Mesquita Feb 05 '19 at 22:23
  • I'm very new to RStudio, so I don't know how to verify `class()`. Also, this is how I read the data (I just changed it to `read.csv2` on your advice, but the error persists): `EUData <- read.csv2("combineddatainprogresscsv.csv", stringsAsFactors = FALSE)` – Paul Matthews Feb 05 '19 at 23:02

1 Answer

It seems your response variable is a factor. You can see it in the output you pasted, for example for V46: `V46 = structure(c(23L, 6L, 4L, 22L, 4L, 8L), .Label = c("", "0", "1", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "2", "20", "3", "4", "5", "6", "7", "8", "9", "Score"), class = "factor")`. The column just before it in that output (presumably V45, the response in your `lm()` call) also ends in `class = "factor"`.
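
A quick way to confirm this is to inspect the column classes. The sketch below uses a small made-up data frame (`df` and its values are assumptions for illustration, not your data) in which a stray text row has turned a numeric-looking column into a factor:

```r
# Hypothetical stand-in for the real data: V45 contains a stray text entry,
# so it is stored as a factor even though most entries look numeric.
df <- data.frame(
  V45 = factor(c("Score", "50", "60", "70")),
  TotalScore = c(0L, 12L, 10L, 9L)
)

class(df$V45)         # "factor"
class(df$TotalScore)  # "integer"
sapply(df, class)     # class of every column at once
str(df)               # compact overview, including factor levels
```

Running the same calls on `Combineddatainprogresscsv` should show which of your columns are factors.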

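I would suggest converting V46 to character and then to numeric (calling `as.numeric()` directly on a factor returns the integer level codes, not the values), and finally filtering out the missing values that the "Score" and "" levels will produce. The same applies to any other factor column used in the model, such as V45.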
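
As a minimal sketch of that conversion, assuming a made-up data frame `df` (the values and the name `df_clean` are hypothetical, chosen only to mimic the "Score" and "" levels in your output):

```r
# Hypothetical stand-in: a factor column whose levels include "" and "Score".
df <- data.frame(
  V46 = factor(c("Score", "13", "10", "", "8")),
  TotalScore = c(0L, 12L, 10L, 14L, 9L)
)

# Go through character first; as.numeric() on the factor itself would
# return the level codes. Entries that cannot be parsed become NA, and
# as.numeric() warns about them, so the warning is silenced here.
df$V46 <- suppressWarnings(as.numeric(as.character(df$V46)))

# Keep only the rows where the conversion produced a number.
df_clean <- df[!is.na(df$V46), ]

# The regression now runs on a numeric response.
fit <- lm(V46 ~ TotalScore, data = df_clean)
```

The same steps apply to whichever column is the response in your own `lm()` call.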

You should definitely listen to the people in the comments so it's easier to help you :)

Felipe Gerard
  • Thanks. I've converted all to numeric. I just now need to omit all of the rows which have a value of '0' in the column 'Total Score'. How do I do this? Can I do something like: if cell in column total score contains 0, omit this entire row from the regression? Thank you. – Paul Matthews Feb 11 '19 at 09:15
  • This is a separate question. Please mark the answer as valid (tick on the left) if it answered your question. To omit the zeroes you can do something like `Combineddatainprogresscsv_nozeroes <- Combineddatainprogresscsv[Combineddatainprogresscsv$TotalScore != 0, ]` and use that instead of the original dataset. – Felipe Gerard Feb 11 '19 at 20:23
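
For reference, the row filter from the last comment can be sketched on a small made-up data frame (`df`, its values, and the names `df_nozeroes` and `df_nozeroes2` are assumptions for illustration). Wrapping the condition in `which()` is an optional extra that avoids all-`NA` rows if `TotalScore` contains missing values:

```r
# Hypothetical stand-in for the data after conversion to numeric.
df <- data.frame(
  V45 = c(50, 60, 70, 80, 90),
  TotalScore = c(0L, 12L, 10L, 0L, 14L)
)

# Keep rows whose TotalScore is not zero. which() returns only the
# positions where the condition is TRUE, so NA comparisons are dropped.
df_nozeroes <- df[which(df$TotalScore != 0), ]

# subset() does the same and also drops rows where the condition is NA.
df_nozeroes2 <- subset(df, TotalScore != 0)

# Fit the regression on the filtered data instead of the original.
fit <- lm(V45 ~ TotalScore, data = df_nozeroes)
```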