summary.rq output depends on sample size

Question

I found out (see below) that the function summary.rq (page 88) from the quantreg package prints different output, depending on whether sample size is greater equal or less than 1001.

I am aware, that rq() uses a different method depending on whether sample size is greater equal or less than 1001. I assume that this is the reason for this behaviour.

An MWE showing the described behaviour:

> library(quantreg)
> x <- seq(0, 100, length.out = 1000)     
> e <- rnorm(1000, mean = 0, sd = 1.2)
> y <- 6 + 0.1 * x + e  
> summary(rq(y ~ x, tau = 0.025))

Call: rq(formula = y ~ x, tau = 0.025)

tau: [1] 0.025

Coefficients:
            coefficients lower bd upper bd
(Intercept) 3.67733      2.92776  3.88165 
x           0.10061      0.09578  0.10675 
Warning message:
In rq.fit.br(x, y, tau = tau, ci = TRUE, ...) : Solution may be 
nonunique

> x <- seq(0, 100, length.out = 1001)     
> e <- rnorm(1001, mean = 0, sd = 1.2)
> y <- 6 + 0.1 * x + e  
> summary(rq(y ~ x, tau = 0.025))

Call: rq(formula = y ~ x, tau = 0.025)

tau: [1] 0.025

Coefficients:
            Value    Std. Error t value  Pr(>|t|)
(Intercept)  3.61744  0.28052   12.89559  0.00000
x            0.10033  0.00477   21.04017  0.00000

Is this desired behaviour? How can I get the first form of output for sample sizes larger than 1000?

My issue is that I am using the summary.rq function in a loop over multiple sample sizes and would like to use the lower and upper band values.

Welcome to CV. The question is not about statistics or its methodology; but rather about using a certain software package. Please consider re-posting it to a different/more fitting place. — cherub, Dec 13 '18 at 14:38
will repost in 90 minutes on stackoverflow, thanks for the hint — Buochserhorn, Dec 13 '18 at 14:46
First i'd advice you to set.seed() so your variables are identical. Second, you can specify the method in rq. For samples <1000 rq uses method="br". If you want to apply that for larger samples simply specify the method. Third, note that even if you don't change the sample size (i.e re-run first 4 lines a few times) you won't get exactly the same numbers. This is because results represent estimates rather than products of closed form solution (i think). This means that the differences between your experiments might not be due to different rq method — JustGettinStarted, Dec 13 '18 at 17:19
@JustGettinStarted Thanks for the hints, especially the set.seed()! But my problem are not the numbers. My problem is the different format of the output. — Buochserhorn, Dec 13 '18 at 17:48
you mean the "Warning message: In rq.fit.br(x, y, tau = tau, ci = TRUE, ...) : Solution may be nonunique" part? If so you can turn warnings off https://stackoverflow.com/questions/16194212/how-to-suppress-warnings-globally-in-an-r-script. But be careful, it may be better to save the outputs to a list and then just print the result after your loop is finished. Warnings wont show then — JustGettinStarted, Dec 13 '18 at 18:05
@JustGettinStarted No, my problem is, I do call exactly the same function, once I get lower bd and upper bd, once I get Std. Error and t value. I want to have the same ouput both times. — Buochserhorn, Dec 14 '18 at 08:49

score 2 · Accepted Answer · answered Dec 16 '18 at 15:23

The difference between the outputs is not coming from rq() but rather summary.rq(). quantreg uses "rank" inference methods for sample sizes smaller than 1000 otherwise "nid" is used. The help file indicates that the "rank" methods can be extremely slow for larger sample sizes. If you insist on having the former output appear for all your looped studies then you can specify

summary(rq(y ~ x, tau = 0.025),se="rank")

However, you may be better served by looking into this more closely. For example if some of your studies have very large sample sizes the computation may become extremely slow, you may want to specify se="nid" for all your studies instead and compute the upper and lower bands manually (upper=Value + 1.96*Std. Error and lower=Value - 1.96*Std. Error.

if you found this answer useful, please select 'answered', if not please indicate what else you were looking for and maybe we can get this question resolved. — JustGettinStarted, Dec 23 '18 at 18:15

summary.rq output depends on sample size

1 Answers1