1

I am trying to extract pairwise differences when calculating quantile regression in the R software (v 4.2.1). The emmeans package (I am using version 1.8.1-1) should allow me to extract these differences if the median is of interest, but I will need to calculate them for the other percentiles, so I wanted to extract them manually. However, I can't get the same results as with emmeans, and I couldn't find the solution in the vignettes or previous posts on emmeans.

Here is the situation: I have three variables. var1 and var2 are categorical with two levels (A and B, and High and Low, respectively). The other variable has been scaled (with a mean of 0 and a standard deviation of 1) so that the estimates represent the effect when this variable is averaged. var1 is interacting with var2 and var3_z. I then compared the estimates with the output of emmeans, especially the first one, as the interpretation is direct. As you can see (code below), the difference between A and B when var2 is "Low" is 1.36, yet emmeans says it is 1.3 (this is not a rounding problem, in other cases it seems to work perfectly, which means I don't understand the summary correctly).

Here is the code, and outputs.

The summary results

modelAll50 <- rq(output ~ var1 * var2 + var1 * var3_z, tau = 0.5, data = dfModelAllControl, method = "fn")
summary(modelAll50)

Call: rq(formula = output ~ var1 * var2 + var1 * var3_z, tau = 0.5, 
    data = dfModelAllControl, method = "fn")

tau: [1] 0.5

Coefficients:
               Value    Std. Error t value  Pr(>|t|)
(Intercept)     0.04322  0.01623    2.66359  0.00774
var1B           1.36359  0.19793    6.88936  0.00000
var2High        0.11678  0.04986    2.34223  0.01919
var3_z         -0.02829  0.01237   -2.28627  0.02226
var1B:var2High  6.60083  0.65356   10.09977  0.00000
var1B:var3_z   -0.18197  0.21099   -0.86245  0.38846

The emmeans results

em <- emmeans(modelAll50, pairwise ~ var1 | var2)
pairs(em) %>%  confint()

var2 = Low:
 contrast estimate    SE    df lower.CL upper.CL
 A - B        -1.3 0.207 10023    -1.70   -0.895

var2 = High:
 contrast estimate    SE    df lower.CL upper.CL
 A - B        -7.9 0.626 10023    -9.13   -6.673

Results are averaged over the levels of: var3_z 
Confidence level used: 0.95 

I don't have this problem when var3 is not put in interaction with var1, or if interacting with any other variable than var1. Could someone please explain what I am doing wrong, either in my understanding of the summary, or in my manual scaling and comparisons, or with emmeans?

Ore
  • 11
  • 1
  • Can you make your post [reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) by providing your dataset by using `dput(dfModelAllControl)`? – jrcalabrese Nov 21 '22 at 01:45

1 Answers1

1

First, as is documented, one of the optional arguments is tau its default, according to the documentation, is

tau = "0.5" (must match an entry in object$tau)

This allows you to work with quantiles other than the median, but you have to specify the ones you want when you fit the model.

Second, the code you showed asks for pairwise comparisons twice -- via the pairwise in the formula and again in the call to pairs(). I suggest you routinely skip that pairwise ~ convenience feature, and just get what you want, when you want it.

modelAll <- rq(output ~ var1 * var2 + var1 * var3_z, 
               tau = c(0.25, 0.5, 0.75), 
               data = dfModelAllControl, 
               method = "fn")

em25 <- emmeans(modelAll, ~ var1 | var2, tau = 0.25)
pairs(em25) |>  confint()

em50 <- emmeans(modelAll, ~ var1 | var2, tau = 0.50)
pairs(em50) |>  confint()

em75 <- emmeans(modelAll, ~ var1 | var2, tau = 0.75)
pairs(em75) |>  confint()

Third, you have var3_z interacting with var1, which means that the levels of var1 compare differently depending on the value of var3_z. And var3_z is a numerical predictor. If you don't specify, emmeans chooses to set var3_z to its mean. You can use at to specify different value(s), for example

em75a <- emmeans(modelAll, ~ var1 | var2 * var3_z, tau = 0.75,
    at = list(var3_z = c(-1,0,1))
Russ Lenth
  • 5,922
  • 2
  • 13
  • 21
  • Specifying var_z = 0 solved the problem, although it was indicated that by default the mean of var_z was used, thus it should have been 0. Thank you! – Ore Nov 22 '22 at 13:53
  • OK. Well it's easy enough to check the value of `mean(var_z)` – Russ Lenth Nov 23 '22 at 14:34