I have a slight problem with my R coursework.

I have created the following dataset:

Dataset

Now I'm going to plot the values based on this dataset using the following command:

plot(x ~ Group.1, data = jarelmaks_vaikelaen23mean, 
    xlab = "Vanus", ylab = "PD", main = "Järelmaks ja väikelaen")

After that, I fit a GLM with the following command. The difference is that I now use the original dataset, in which the dependent variable takes the values 1/0.

The original dataset

GLM command:

jarelmaks_vaikelaen23_mudel <- glm(Default ~ Vanus.aastates + Toode, 
    family = binomial(link = 'logit'), data = jarelmaks_vaikelaen_23)

Now, I'm trying to predict the values using my model.

predict(jarelmaks_vaikelaen23_mudel,data.frame(Vanus.aastates=x),type = "resp")

Unfortunately, I get the following error message:

Error in data.frame(Vanus.aastates = x) : object 'x' not found

Can you give me some ideas on how to solve this problem, or explain how the predict() command works?

Martin Schmelzer
Martiiin
  • You have the line `data.frame(Vanus.aastates=x)`. What is `x` supposed to be in that case? Where are the new values for `Vanus.aastates` you want to use for prediction? Also, your model contains a `Toode` term so you'll need to provide values for that variable as well in order to make predictions. – MrFlick Apr 16 '18 at 15:01
  • Well, yeah .. this "data.frame(Vanus.aastates=x)" is something I found before and it worked using a model with one independent variable. – Martiiin Apr 16 '18 at 15:06
  • And all this "value providing" stuff is something I don't understand at the moment. I'm not sure how to write it down. – Martiiin Apr 16 '18 at 15:08
  • When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. Pictures of data are not helpful. – MrFlick Apr 16 '18 at 15:09
  • Any well-explained article or relevant site would be of great help! :) – Martiiin Apr 16 '18 at 15:10
  • Basically, I want to do the same as in the example below, but with 2 variables: `plot(bodysize,survive,xlab="Body size",ylab="Probability of survival")`, `g=glm(survive~bodysize,family=binomial,dat)`, `curve(predict(g,data.frame(bodysize=x),type="resp"),add=TRUE)`, `points(bodysize,fitted(g),pch=20)`, then `par(new=TRUE)` and `plot(AggBd$Group.1,AggBd$x,pch=30)`. https://stackoverflow.com/a/10571737/9607921 – Martiiin Apr 16 '18 at 15:26
  • For instance: `g=glm(survive~bodysize + HEIGTH,family=binomial,dat)` – Martiiin Apr 16 '18 at 15:28
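
The `curve()` idiom quoted in the comments above works because `curve()` itself supplies a grid of values under the name `x` while it evaluates its expression; outside `curve()` no such `x` exists, which would explain the "object 'x' not found" error. With two predictors, the second one has to be fixed at some value. A minimal sketch on simulated data follows; the column `height`, the helper `pd_at`, and all numbers are assumptions made for illustration, not taken from the linked answer.

```r
# Simulated data: purely illustrative, not the data from the linked answer.
set.seed(42)
dat <- data.frame(
  bodysize = rnorm(100, mean = 20, sd = 3),
  height   = rnorm(100, mean = 10, sd = 2)   # hypothetical second predictor
)
dat$survive <- rbinom(100, 1, plogis(-4 + 0.2 * dat$bodysize + 0.05 * dat$height))

g <- glm(survive ~ bodysize + height, family = binomial, data = dat)

# Predicted probability as a function of bodysize, with height held at its mean.
pd_at <- function(bodysize) {
  predict(g,
          newdata = data.frame(bodysize = bodysize, height = mean(dat$height)),
          type = "response")
}

plot(survive ~ bodysize, data = dat,
     xlab = "Body size", ylab = "Probability of survival")
# curve() evaluates pd_at(x) with `x` bound to a grid spanning the plot's x-axis.
curve(pd_at(x), add = TRUE)
```

Holding `height` at its mean is only one possible choice; drawing one curve per value of interest is equally valid.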

1 Answer

When you pass a data frame to the `newdata` argument of `predict()`, its column names must match the independent variables used when fitting the model, and it must contain a column for every one of them. That is, your predict call should look like

predict(
    jarelmaks_vaikelaen23_mudel,
    newdata = data.frame(
        Vanus.aastates = SOMETHING,
        Toode = SOMETHING_ELSE
        ),
    type = "response"
    )
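
As a hedged illustration of what could replace `SOMETHING` and `SOMETHING_ELSE`, the sketch below fits the same model formula on simulated data and predicts for a few chosen ages and products. The data values and the product levels `"A"` and `"B"` are invented; only the column names come from the question.

```r
# Simulated stand-in for jarelmaks_vaikelaen_23; values and levels are made up.
set.seed(1)
n <- 200
demo <- data.frame(
  Vanus.aastates = sample(18:70, n, replace = TRUE),
  Toode = factor(sample(c("A", "B"), n, replace = TRUE))
)
demo$Default <- rbinom(
  n, 1,
  plogis(-1 - 0.03 * demo$Vanus.aastates + 0.8 * (demo$Toode == "B"))
)

model <- glm(Default ~ Vanus.aastates + Toode,
             family = binomial(link = "logit"), data = demo)

# One row per age/product combination for which a prediction is wanted.
new_values <- expand.grid(
  Vanus.aastates = c(25, 40, 60),
  Toode = levels(demo$Toode)
)

# type = "response" returns probabilities rather than log-odds.
new_values$PD <- predict(model, newdata = new_values, type = "response")
new_values
```

`expand.grid()` is used here only as a convenient way to build every combination; any data frame with columns named `Vanus.aastates` and `Toode` would do.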
Russ Hyde
  • "the data-frame should have column names that match the variables used as independent variables in your model-fitting step." So can I just write "Vanus.aastates = Vanus.aastates" and "Toode = Toode"? – Martiiin Apr 16 '18 at 15:36
  • Or maybe I don't quite understand which of the variables have to match. – Martiiin Apr 16 '18 at 15:36
  • I was assuming you were predicting on a separate dataset from that used to set up the model. If you want to get the predicted values for your original data you don't have to provide a `newdata`. – Russ Hyde Apr 16 '18 at 15:46
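
To illustrate the last comment: when `newdata` is omitted, `predict()` returns the predictions for the rows the model was fitted on, which for `type = "response"` are the same numbers as `fitted()`. The sketch below uses simulated data with hypothetical column names (`age`, `product`, `default`) and then averages the predicted probabilities by age. The `Group.1` and `x` columns in the question look like the default output names of `aggregate()`, though that is an assumption about how that dataset was built.

```r
# Simulated data with hypothetical column names; not the asker's dataset.
set.seed(7)
demo <- data.frame(
  age = sample(18:70, 150, replace = TRUE),
  product = factor(sample(c("A", "B"), 150, replace = TRUE))
)
demo$default <- rbinom(150, 1, plogis(-0.5 - 0.02 * demo$age))

model <- glm(default ~ age + product, family = binomial, data = demo)

# No newdata: predictions for the original rows.
p_train <- predict(model, type = "response")
same_as_fitted <- isTRUE(all.equal(p_train, fitted(model)))

# Mean predicted probability per age; aggregate() names the columns Group.1 and x.
mean_by_age <- aggregate(p_train, by = list(demo$age), FUN = mean)
head(mean_by_age)
```

A data frame like `mean_by_age` could then be drawn with `lines()` over a plot of the observed means per age.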