
I have an exercise that says

Find a 95% confidence interval on the mean number of games won by a team when x2 = 2300, x7 = 56 and x8 = 2100.

Is there a function in R that gives such a confidence interval directly?

I've thought about using confint(f), but that function gives intervals for one or more model parameters. As far as I understand, what I need is not an interval for a parameter but for a function of the parameters, something like beta0 + beta1*xi, where the betas are already estimated and the point xi is given by the values of x2, x7 and x8.

Another way would be to do it 'manually', but that gets complicated because I would have to calculate the standard error, the variance, the t value, etc.

Could you help please?

Thank you in advance

Glen_b
user9802913
  • In the original data there were the vectors y,x2,x7,x8 – user9802913 Dec 09 '18 at 04:15
  • I'm not seeing a clear statistical question here. You've given us 3 separate integers but you have not told us what they represent. What are x2, x7 and x8? – IRTFM Dec 09 '18 at 05:46
  • @42- x2 is passing yards, x7 is percent rushing and x8 is opponents' rushing yards. Those are data from National football league 1976 team performance (according to the book) – user9802913 Dec 09 '18 at 05:53
  • So it makes no sense to take the mean of x2, x7, and x8. Do you have vectors? Are there lengths for each of x2, x7, and x8? Are these sums or summaries of longer sets of values? – IRTFM Dec 09 '18 at 16:03
  • @42- Why makes no sense? To your questions, Yes, yes, not that I know. – user9802913 Dec 09 '18 at 21:57
  • Makes no sense to request a mean of three numbers that have such a tenuous relationship, and wouldn't even if they were measured in the same units, which they aren't. – IRTFM Dec 09 '18 at 22:37

2 Answers

2

You need to look not at confint but at predict.lm:

Details

predict.lm produces predicted values, obtained by evaluating the regression function in the frame newdata (which defaults to model.frame(object)). If the logical se.fit is TRUE, standard errors of the predictions are calculated. If the numeric argument scale is set (with optional df), it is used as the residual standard deviation in the computation of the standard errors, otherwise this is extracted from the model fit. Setting intervals specifies computation of confidence or prediction (tolerance) intervals at the specified level, sometimes referred to as narrow vs. wide intervals.

For the newdata argument, you'll need to set up a data frame whose column names match the variable names used in the model fit and which contains the values you want a prediction for.

Here's an example showing how to use newdata:

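x1 <- c(1, 2, 5, 6); x2 <- c(3, 2, 4, 1); x3 <- c(5, 4, 3, 4); y <- c(21, 21, 27, 23)
res <- lm(y ~ x1 + x2 + x3)
# newdata must use the same variable names as the model formula
# (this toy data has 4 observations and 4 coefficients, so no residual degrees
# of freedom remain and lwr/upr will not be meaningful; real data needs more rows)
predict.lm(res, newdata = data.frame(x1 = 4, x2 = 4, x3 = 2),
           interval = "confidence")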

(i.e. you'll need something of the form data.frame(x2 = ..., x7 = ..., etc.), filling in the values you want)

However, you also need to tell it the type of interval you need.

(predict is the generic; if you call predict on an lm object, it will call predict.lm, but to get the right help you need to look directly at the specific function)
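A minimal sketch of that dispatch, using simulated data (the data, the seed and the object names `d`, `fit` and `new` are assumptions for illustration, not the exercise's data). It is meant to show two things: calling the generic `predict` on an `lm` object should give the same result as calling `predict.lm` directly, and leaving out `interval` returns only the point estimate.

```r
# Simulated data for illustration only (not the NFL data from the exercise)
set.seed(1)
d <- data.frame(x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20))
d$y <- 1 + 2 * d$x1 - d$x2 + 0.5 * d$x3 + rnorm(20)

fit <- lm(y ~ x1 + x2 + x3, data = d)

# newdata: one row, with column names matching the regressors in the formula
new <- data.frame(x1 = 4, x2 = 4, x3 = 2)

# Default (interval = "none"): only the fitted value is returned
point_only <- predict(fit, newdata = new)

# interval = "confidence": a matrix with columns fit, lwr, upr
ci_generic <- predict(fit, newdata = new, interval = "confidence")
ci_method  <- predict.lm(fit, newdata = new, interval = "confidence")

print(point_only)
print(ci_generic)
```

The `fit` column of `ci_generic` is the same number as `point_only`; `lwr` and `upr` are the limits of the interval for the mean response at the values in `new`.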

Community
Glen_b
  • I don't understand very well. Could you explain with an example? What am I going to write in interval=" "? – user9802913 Dec 09 '18 at 06:54
  • It says type of interval calculation – user9802913 Dec 09 '18 at 06:55
  • What does that mean? – user9802913 Dec 09 '18 at 06:55
  • The default (response) is what you need. Don't specify anything for it – Glen_b Dec 09 '18 at 08:54
  • Oh then why did you write: You'll need to set up a data frame with the same column names as used in the model fit which contains the set values you want a prediction for, for the newdata argument. ? – user9802913 Dec 09 '18 at 21:54
  • Because that's what you need for the `newdata` argument. Why would what you need for the `interval` argument (the default, so leave that blank) impact what you need for the `newdata` argument? – Glen_b Dec 09 '18 at 22:25
  • Would be predict(f), where f is the linear model (f<-y∼x2+x7+x8)? – user9802913 Dec 09 '18 at 22:44
  • No, I think would be predict(f,x2,x7,x8) where `newdata` is x2,x7,x8? – user9802913 Dec 09 '18 at 22:48
  • Comments are not for extended discussion. You don't have a `newdata` argument there. See the syntax for the `newdata` argument in `predict.lm`. Your data frame supplied to the `newdata` argument must have the same names as the variables in the regression call. – Glen_b Dec 09 '18 at 22:55
  • See also the first two examples in the help on `predict.lm` – Glen_b Dec 09 '18 at 23:02
  • Notice that in the first example in the help on `predict.lm` the new data is defined as new <- data.frame(x = seq(-3, 3, 0.5)), where x is the regression variable in the model. How would I write it if I have 3 regression variables? – user9802913 Dec 09 '18 at 23:22
  • type `?data.frame` at the console to see how to create a data frame (then re-read my answer for details). Again, comments are not for extended discussion and *really* not for me to teach you the basics of how to use R; there are many tutorials for that. – Glen_b Dec 09 '18 at 23:27
  • Specifically see the first example in the help on data.frame; there's also lots of help online for how to create them, with many examples. If you have a new question about creating data frames, you can post it, of course (if it hasn't already been asked). A piece of advice: if in future you write your questions with a [small reproducible example](https://stackoverflow.com/a/5963610/330679), it makes it easier for answerers to give an explicit solution on the example data – Glen_b Dec 09 '18 at 23:33
  • I added some more specific detail to the answer. – Glen_b Dec 09 '18 at 23:40
  • @Ben thanks, but the intent was for that to be as it was -- that the entire quote after the heading was in backticks. However, I will leave it as is now. – Glen_b Dec 10 '18 at 01:55
  • oh, sorry. Why? Seems harder to read to me ... I would opt for *either* block-quote *or* code formatting, not both at once? – Ben Bolker Dec 10 '18 at 02:43
  • I was trying to make it look more or less close to how it shows up when doing `help(predict.lm,help="text")` (i.e. getting it in the R session rather than in a browser, which is how I am used to reading it). I don't suppose that's a particularly compelling reason, which is why I didn't revert. – Glen_b Dec 10 '18 at 02:58
  • It was not the default response what I need but the interval="confidence" : predict.lm(res,newdata=data.frame(x1=4,x2=4,x3=2),interval="confidence") – user9802913 Dec 11 '18 at 19:09
  • @Isa Sorry, I misunderstood your question -- where you said "type of interval calculation" I thought you were referring to the "type" argument. I see now you were referring to the interval argument in the comment above it. I had mentioned the need to specify the interval type before so I didn't think you could be referring to that (nevertheless, rereading now, clearly you did) – Glen_b Dec 11 '18 at 23:32
  • :) this one predict.lm(res,newdata=data.frame(x1=4,x2=4,x3=2),interval="confidence") instead of predict.lm(res,newdata=data.frame(x1=4,x2=4,x3=2)) – user9802913 Dec 12 '18 at 20:01
  • Hmm, I don't know why I didn't see that comment before. Odd. – Glen_b Apr 13 '19 at 14:57
2

Yes.

There is a function in R that gives such a confidence interval directly.

Just type

predict.lm(f,newdata=data.frame(x2=2300,x7=56,x8=2100),interval="confidence")

where f is the fitted linear model, i.e. f<-lm(y~x2+x7+x8), and y, x2, x7, x8 are your data vectors.
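The call above can also be checked against the 'manual' route mentioned in the question. The sketch below assumes simulated data (the sample size, ranges, coefficients and seed are hypothetical, chosen only so the code runs) and computes the interval as the estimate plus or minus a t quantile times the standard error of the estimated mean, then compares it with `predict`.

```r
# Simulated data for illustration only; variable names follow the question,
# but the values, ranges and coefficients are hypothetical
set.seed(42)
n  <- 30
x2 <- runif(n, 1400, 3000)
x7 <- runif(n, 40, 70)
x8 <- runif(n, 1400, 2900)
y  <- -2 + 0.004 * x2 + 0.2 * x7 - 0.005 * x8 + rnorm(n, sd = 1.7)

f <- lm(y ~ x2 + x7 + x8)

# Point at which the mean response is estimated (leading 1 for the intercept)
x0 <- c(1, 2300, 56, 2100)

est   <- sum(x0 * coef(f))                      # x0' beta-hat
se    <- sqrt(drop(t(x0) %*% vcov(f) %*% x0))   # sqrt(x0' Var(beta-hat) x0)
tcrit <- qt(0.975, df = df.residual(f))         # two-sided 95% t quantile
manual <- c(fit = est, lwr = est - tcrit * se, upr = est + tcrit * se)

auto <- predict(f, newdata = data.frame(x2 = 2300, x7 = 56, x8 = 2100),
                interval = "confidence")

print(manual)
print(auto)
```

Both printouts should agree up to floating-point error, which is the sense in which `predict` saves computing the standard error and t value by hand.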


As a side note, this function can also give a prediction interval: just replace "confidence" with "prediction".
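To see the difference between the two interval types, here is a small sketch on simulated data (the data, seed and the names `conf_int` and `pred_int` are assumptions for illustration). The confidence interval is for the mean response at the given point, while the prediction interval is for a single new observation there, so it should come out wider around the same fitted value.

```r
# Simulated data for illustration only; values and coefficients are hypothetical
set.seed(7)
n  <- 30
x2 <- runif(n, 1400, 3000)
x7 <- runif(n, 40, 70)
x8 <- runif(n, 1400, 2900)
y  <- -2 + 0.004 * x2 + 0.2 * x7 - 0.005 * x8 + rnorm(n, sd = 1.7)

f   <- lm(y ~ x2 + x7 + x8)
new <- data.frame(x2 = 2300, x7 = 56, x8 = 2100)

# level = 0.95 is the default; it is written out here for clarity
conf_int <- predict(f, newdata = new, interval = "confidence", level = 0.95)
pred_int <- predict(f, newdata = new, interval = "prediction", level = 0.95)

print(rbind(confidence = conf_int[1, ], prediction = pred_int[1, ]))
```

For the exercise as stated (the mean number of games won), the confidence interval is the one that applies.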

bbiasi
user9802913