
I want to incorporate weights into the likelihood so that it does what svyglm does with weights.

According to Jeremy Miles and other sources, the svyglm function uses weights to "weight the importance of each case to make them representative (to each other, ...)".

Here is my data:

(dat <- data.frame(
  A = c(1, 1, 0, 0), B = c(1, 0, 1, 0),
  Pass = c(278, 100, 153, 79), Fail = c(743, 581, 1232, 1731),
  Weights = c(3, 1, 12, 3)
))

Here is my likelihood function:

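ll <- function (b0, b1, b2, b3) {
  # Odds of passing for each row of dat
  odds <- exp(b0) * (1 + b1 * dat$A + b2 * dat$B + b3 * dat$A * dat$B)
  # Unweighted negative binomial log-likelihood; size = Pass + Fail
  -sum(dbinom(
    x = dat$Pass, size = rowSums(dat[, 3:4]),
    prob = odds / (1 + odds), log = TRUE))
}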
double-beep
Krantz
  • I can't really see what's the relation between survey weights and, I guess, simply weighted maximum likelihood. Doing `-sum(dat$Weights * dbinom(...` would give more importance to the cases with higher weights. Is there something wrong with that? – Julius Vainora Dec 16 '18 at 15:11
  • I think your solution works. It seems to be what I am looking for. Just let me ask this: is it the likelihood contribution of each observation that is being weighted in your solution? – Krantz Dec 16 '18 at 23:18
  • Yes, we may interpret it like that. Doubling the log likelihood/contribution of x_i = including x_i twice. – Julius Vainora Dec 16 '18 at 23:26
  • Thank you very much for that @Julius Vainora. – Krantz Dec 16 '18 at 23:27

1 Answer

As the answers you refer to point out, weights are used differently in different contexts. In your current example I don't really see any populations, but the goal is pretty clear: each observation should have a specified "importance". The weighted maximum likelihood would then simply use

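ll <- function (b0, b1, b2, b3) {
  # Odds of passing for each row of dat
  odds <- exp(b0) * (1 + b1 * dat$A + b2 * dat$B + b3 * dat$A * dat$B)
  # Negative log-likelihood, with each row's contribution scaled by its weight
  -sum(dat$Weights * dbinom(
    x = dat$Pass, size = rowSums(dat[, 3:4]),
    prob = odds / (1 + odds), log = TRUE))
}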
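
Written out, the quantity this function returns can be sketched as follows. The symbols are notation assumed here for illustration: $y_i$ is Pass, $n_i$ is Pass + Fail, $w_i$ is Weights, and $A_i$, $B_i$ are the two indicator columns of row $i$.

```latex
\begin{aligned}
\mathrm{odds}_i(b) &= e^{b_0}\,\bigl(1 + b_1 A_i + b_2 B_i + b_3 A_i B_i\bigr),
\qquad
p_i(b) = \frac{\mathrm{odds}_i(b)}{1 + \mathrm{odds}_i(b)}, \\[4pt]
-\ell_w(b) &= -\sum_{i=1}^{4} w_i \,
\log\!\left[\binom{n_i}{y_i}\, p_i(b)^{y_i}\,\bigl(1 - p_i(b)\bigr)^{n_i - y_i}\right].
\end{aligned}
```

Setting every $w_i = 1$ recovers the unweighted negative log-likelihood from the question.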

That can indeed be interpreted as including the i-th observation in the sample w_i times. This interpretation is, of course, exactly true only for integer weights, while the method works with any weights.
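
The replication interpretation can be checked numerically. The following is a minimal sketch using only base R; the helper `nll`, its arguments `par`, `d` and `w`, and the parameter values in `par0` are introduced here for illustration and are not part of the original code.

```r
# Data from the question
dat <- data.frame(
  A = c(1, 1, 0, 0), B = c(1, 0, 1, 0),
  Pass = c(278, 100, 153, 79), Fail = c(743, 581, 1232, 1731),
  Weights = c(3, 1, 12, 3)
)

# Negative log-likelihood of the same model as ll(), but taking the data and
# the weights as arguments; `w` multiplies each row's log-likelihood term.
nll <- function(par, d, w = rep(1, nrow(d))) {
  odds <- exp(par[1]) * (1 + par[2] * d$A + par[3] * d$B + par[4] * d$A * d$B)
  -sum(w * dbinom(x = d$Pass, size = d$Pass + d$Fail,
                  prob = odds / (1 + odds), log = TRUE))
}

# Repeat row i exactly Weights[i] times (only possible for integer weights)
dat_rep <- dat[rep(seq_len(nrow(dat)), times = dat$Weights), ]

# Arbitrary illustrative parameter values (b0, b1, b2, b3), not estimates
par0 <- c(-3, 0.5, 0.5, 0.5)

weighted   <- nll(par0, dat, w = dat$Weights)  # weights inside the sum
replicated <- nll(par0, dat_rep)               # unweighted, rows repeated

# The two values should agree up to floating-point error
all.equal(weighted, replicated)
```

Because the two objectives coincide at every parameter value, minimizing either one leads to the same estimates; the weighted form simply avoids building the replicated data and also accepts non-integer weights.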

Julius Vainora