
I built an lm model in R on 65000 rows, and I want to see the predictions for only the first ten rows to check how well my model predicts. Below is the code I wrote to do this, but it keeps predicting values for all 65000 rows. Can someone help me?

test_data <- mydata[1:10,] 
test_data<-subset(test_data,select = -c(24)) #delete column which i try to predict
predict(lm_model109,new=test_data)
  • https://stackoverflow.com/help/minimal-reproducible-example – Baraliuh Mar 07 '22 at 21:59
  • Your code looks fine as far as the information you've provided goes. Could you (a) verify `nrow(test_data)`, (b) if you assign the result, `pred <- predict(...)`, verify `length(pred)`, (c) make sure everything is spelled correctly, and (d) provide a bit more code? What modeling function did you use? How did you call it? etc. – Gregor Thomas Mar 07 '22 at 22:00
  • How did you build your `lm_model109` object? Did you use a formula syntax? Did you have `$` in your formula? That's a common mistake. Really hard to give any specific solutions without a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – MrFlick Mar 07 '22 at 22:13
  • The second argument to `predict` should be "newdata" rather than "new". It's possible that the partial matching isn't matching this as you expect because `predict.lm` also takes the `...` argument – Miff Mar 07 '22 at 22:13
  • @GregorThomas can you please answer the other question I asked on my profile? I am really stuck – Lucasjansens Mar 09 '22 at 15:15
  • Can't really help on that one without a reproducible example. – Gregor Thomas Mar 09 '22 at 15:27
  • @GregorThomas lm_model2002 <- lm(mydata$`AC: Volume` ~ `Market Area (L1)`,data=mydata) summary(lm_model2002) #0.1126 predict(lm_model2002,data.frame(`Market Area (L1)`=="Algeria")) and it still keeps predicting all 65000 rows – Lucasjansens Apr 12 '22 at 09:23
  • @MrFlick see comment above – Lucasjansens Apr 12 '22 at 09:23
  • @Miff see new comment – Lucasjansens Apr 12 '22 at 09:24
  • @Lucasjansens please edit new information into your question so it's all presented in one place and formatted nicely - don't just put things in comments. Also editing your question will bump it to the top of the "Active Questions" queue – Gregor Thomas Apr 12 '22 at 14:43
  • You should never have `$` or data.frame names in your formula. Try ``lm_model2002 <- lm(`AC: Volume` ~ `Market Area (L1)`, data=mydata)`` – MrFlick Apr 13 '22 at 01:32
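
The `$`-in-the-formula pitfall raised in the comments can be reproduced on toy data. This is only a minimal sketch: the columns `x` and `y` and the simulated values are made up for illustration and are not the asker's real data, and since the call that built `lm_model109` was never posted, it is an assumption that this is what went wrong there. When the predictors are written as `mydata$x`, `predict()` cannot find them in `newdata`, so it falls back to the full original column (with a warning) and returns one prediction per original row.

```r
# Hypothetical toy data standing in for the asker's 65000-row data frame.
set.seed(42)
mydata <- data.frame(x = rnorm(100))
mydata$y <- 2 * mydata$x + rnorm(100)

# First ten rows, predictor column only (the response is not needed).
test_data <- mydata[1:10, "x", drop = FALSE]

# Problematic: `$` inside the formula ties the model to the full `mydata`
# columns, so `newdata` is effectively ignored (R also emits a warning).
bad_model <- lm(mydata$y ~ mydata$x)
bad_pred <- suppressWarnings(predict(bad_model, newdata = test_data))
length(bad_pred)

# Fixed: bare column names plus `data =`; `newdata` is now respected.
good_model <- lm(y ~ x, data = mydata)
good_pred <- predict(good_model, newdata = test_data)
length(good_pred)
```

In this sketch `length(bad_pred)` equals the number of rows in `mydata`, while `length(good_pred)` equals the number of rows in `test_data`. For column names containing spaces or punctuation, the same pattern applies with backticks around the bare names, as in the last comment.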

0 Answers