1

I have an array containing anywhere from 10 000 to 50 000 elements, representing values at risk in stressed economies. I am interested in calculating the x-th quantile for this array, using an ordinary weighting.

Immediately getting to my question - is it possible to determine a quantile for a large unsorted array without first sorting and then simply indexing? Or perhaps implementing functionality while sorting to also determine some quantile? Speed is of the utmost importance in my case, but a slower method not relying on first sorting would also be interesting to me.


Doing so the conventional way has been quite simple, by first sorting the array and then building SetWeights() to do a bit of simple interpolation (Alpha is the desired quantile fraction)

protected sealed override void SetWeights()
{
    double n = (NumberOfScenarios - 1) * Alpha + 1;

    if (Math.Abs(n - 1d) < double.Epsilon)
    {
        Weights = new List<double> { 1.0 };
        Indices = new List<int> { 0 };
    }
    else if (Math.Abs(n - NumberOfScenarios) < double.Epsilon)
    {
        Weights = new List<double> { 1.0 };
        Indices = new List<int> { NumberOfScenarios - 1 };
    }
    else
    {
        int k = (int)n;
        double d = n - k;

        Weights = new List<double> { 1.0 - d, d };
        Indices = new List<int> { k - 1, k };
    }
}

And then calculating the quantile by taking the respective indices for the weights

public double Quantile(List<double> sortedScenarios)
{
    var varEstimator = 0.0;
    for (var i = 0; i < Indices.Count; ++i)
    {
        varEstimator += Weights[i] * sortedSequence[Indices[i]];
    }
    return varEstimator;
}
M. Schena
  • 2,039
  • 1
  • 21
  • 29
Eric Hansen
  • 1,749
  • 2
  • 19
  • 39
  • 1
    I think not. Calculating of a quantile is based on a sorted array. Maybe you should sort the array during input... – Zachi Shtain Feb 02 '16 at 10:06
  • How are you sorting the array? – toadflakz Feb 02 '16 at 10:12
  • Technically, you can compute quantile with `O(n)` asymptotic while sorting is `O(n*log(n))`; but, IMHO, it's a *small* improve on 50000 items, so sort the array. http://stackoverflow.com/questions/251781/how-to-find-the-kth-largest-element-in-an-unsorted-array-of-length-n-in-on – Dmitry Bychenko Feb 02 '16 at 10:14
  • @toadflakz Just using the standard array sort. Sorting on input isn't possible in my case unfortunately, but certainly worth noting anyways. – Eric Hansen Feb 02 '16 at 10:14

1 Answers1

0

Consider the quicksort algorithm. It uses a pivot element to divide a set in two: one with elements smaller than the pivot element and one with bigger elements. It then goes on to sort each smaller set.

If you are only interested in finding a particular quantile, then any subset that lies entirely outside the quantile does not need to be sorted further. In fact, if you don't need the quantile to be sorted, then only the subsets that contain the quantile borders need to be sorted.

So I suggest using a modified quicksort that only sorts the subsets that contain the lower and upper border of the quantile.

Klas Lindbäck
  • 33,105
  • 5
  • 57
  • 82