12

I am confused with the way predict.glm function in R works. According to the help,

The "terms" option returns a matrix giving the fitted values of each term in the model formula on the linear predictor scale.

Thus, if my model has form f(y) = X*beta, then command

predict(model, X, type='terms')

is expected to produce the same matrix X, multiplied by beta element-wise. For example, if I train the following model

test.data = data.frame(y = c(0,0,0,1,1,1,1,1,1), x=c(1,2,3,1,2,2,3,3,3))
model = glm(y~(x==1)+(x==2), family = 'binomial', data = test.data)

the resulting coefficients are

beta <- model$coef

Design matrix is

X <- model.matrix(y~(x==1)+(x==2), data = test.data)

  (Intercept) x == 1TRUE x == 2TRUE
1           1          1          0
2           1          0          1
3           1          0          0
4           1          1          0
5           1          0          1
6           1          0          1
7           1          0          0
8           1          0          0
9           1          0          0

Then multiplied by coefficients it should look like

pred1 <- t(beta * t(X))

  (Intercept) x == 1TRUE x == 2TRUE
1    1.098612  -1.098612  0.0000000
2    1.098612   0.000000 -0.4054651
3    1.098612   0.000000  0.0000000
4    1.098612  -1.098612  0.0000000
5    1.098612   0.000000 -0.4054651
6    1.098612   0.000000 -0.4054651
7    1.098612   0.000000  0.0000000
8    1.098612   0.000000  0.0000000
9    1.098612   0.000000  0.0000000

However, actual matrix produced by predict.glm seems to be unrelated to this. The following code

pred2 <- predict(model, test.data, type = 'terms')

      x == 1     x == 2
1 -0.8544762  0.1351550
2  0.2441361 -0.2703101
3  0.2441361  0.1351550
4 -0.8544762  0.1351550
5  0.2441361 -0.2703101
6  0.2441361 -0.2703101
7  0.2441361  0.1351550
8  0.2441361  0.1351550
9  0.2441361  0.1351550
attr(,"constant")
[1] 0.7193212

How does one interpret such results?

Zheyuan Li
  • 71,365
  • 17
  • 180
  • 248
David Dale
  • 10,958
  • 44
  • 73
  • It seems that when predicting `terms` predict uses different contrasts, but none of the built in seem to work. Also, to confirm `all.equal(rowSums(predict(model, test.data, type = 'terms')) + attributes(predict(model, test.data, type = 'terms'))$constant, predict(model, test.data))` – toni057 Jun 22 '16 at 11:05
  • Zheyuan, don't panic so much ;) – David Dale Aug 24 '16 at 05:28

2 Answers2

15

I have already edited your question, to include "correct" way of getting (raw) model matrix, model coefficients, and your intended term-wise prediction. So your other question on how to get these are already solved. In the following, I shall help you understand predict.glm().


predict.glm() (actually, predict.lm()) has applied centring constraints for each model term when doing term-wise prediction.

Initially, you have a model matrix

X <- model.matrix(y~(x==1)+(x==2), data = test.data)

but it is centred, by dropping column means:

avx <- colMeans(X)
X1 <- sweep(X, 2L, avx)

> avx
(Intercept)  x == 1TRUE  x == 2TRUE 
  1.0000000   0.2222222   0.3333333 

> X1
  (Intercept) x == 1TRUE x == 2TRUE
1           0  0.7777778 -0.3333333
2           0 -0.2222222  0.6666667
3           0 -0.2222222 -0.3333333
4           0  0.7777778 -0.3333333
5           0 -0.2222222  0.6666667
6           0 -0.2222222  0.6666667
7           0 -0.2222222 -0.3333333
8           0 -0.2222222 -0.3333333
9           0 -0.2222222 -0.3333333

Then term-wise computation is done using this centred model matrix:

t(beta*t(X1))

  (Intercept) x == 1TRUE x == 2TRUE
1           0 -0.8544762  0.1351550
2           0  0.2441361 -0.2703101
3           0  0.2441361  0.1351550
4           0 -0.8544762  0.1351550
5           0  0.2441361 -0.2703101
6           0  0.2441361 -0.2703101
7           0  0.2441361  0.1351550
8           0  0.2441361  0.1351550
9           0  0.2441361  0.1351550

After centring, different terms are vertically shifted to have zero mean. As a result, intercept will be come 0. No worry, a new intercept is computed, by aggregating shifts of all model terms:

intercept <- as.numeric(crossprod(avx, beta))
# [1] 0.7193212

Now you should have seen what predict.glm(, type = "terms") gives you.

Zheyuan Li
  • 71,365
  • 17
  • 180
  • 248
  • 2
    Is there a way to get the un-centerized terms with predict()? – PingPong Sep 27 '20 at 15:49
  • 1
    The fact that `predict.lm` does not have an option for uncentered terms is problematic, especially when there are complex interactions, i.e., where one variable interacts with more than one other variable. I wish that uncentered was the default. – Frank Harrell Oct 10 '22 at 14:43
0

This code will compute uncentered term values.

## Extracted from stats::predict.lm (called by predict.Glm for type='terms')

ucmterms <-
  function (object, newdata, terms = NULL, na.action = na.pass, ...) {
    tt <- terms(object)
    if (missing(newdata) || is.null(newdata)) {
      mm <- X <- model.matrix(object)
      mmDone <- TRUE
      offset <- object$offset
    }
    else {
      Terms <- delete.response(tt)
      m <- model.frame(Terms, newdata, na.action = na.action, 
                       xlev = object$xlevels)
      if (!is.null(cl <- attr(Terms, "dataClasses"))) 
        .checkMFClasses(cl, m)
      X <- model.matrix(Terms, m, contrasts.arg = object$contrasts)
      mmDone <- FALSE
    }
       
    beta <- object$coefficients
    nrp <- num.intercepts(object)   # in rms
    ## If > 1 intercept, remove all but one
    if(nrp > 1) beta <- cbind(beta[1], beta[-(1 : nrp)])
       
    if (!mmDone) {
      mm <- model.matrix(object)
      mmDone <- TRUE
    }
    aa <- attr(mm, "assign")
    ll <- attr(tt, "term.labels")
    hasintercept <- attr(tt, "intercept") > 0L
    if (hasintercept) ll <- c("(Intercept)", ll)
    aaa <- factor(aa, labels = ll)
    asgn <- split(order(aa), aaa)
    if(hasintercept) asgn$"(Intercept)" <- NULL

    nterms <- length(asgn)
    if (nterms > 0) {
      predictor <- matrix(ncol = nterms, nrow = NROW(X))
      dimnames(predictor) <- list(rownames(X), names(asgn))
      for(i in seq.int(1L, nterms, length.out = nterms)) {
        ii <- asgn[[i]]
        predictor[, i] <- X[, ii, drop = FALSE] %*% beta[ii]
      }
      if (!is.null(terms)) predictor <- predictor[, terms, drop = FALSE]
    } else predictor <- matrix(0, NROW(X), 0L)

    attr(predictor, "constant") <- 0

    if (missing(newdata) && ! is.null(na.act <- object$na.action))
      predictor <- napredict(na.act, predictor)

    predictor
}
Frank Harrell
  • 1,954
  • 2
  • 18
  • 36