37

I want to mimic the Excel equivalent PERCENTILE function in C# (or in some pseudo code). How can I do that? The function should take two arguments where the first is a list of values and the second is for what percentile the function should calculate for.

Tanks!

Edit: I'm sorry if my question came across like I had not tried it my self. I just couldn't understand how the excel function worked (yes, I tried wikipedia and wolfram first) and I thought I would understand it better if someone presented it in code. @CodeInChaos gave an answer that seem to be what I'm after.

picknick
  • 3,897
  • 6
  • 33
  • 48
  • could you give a pseudocode example of function usage, input data and expected output result? – sll Nov 15 '11 at 13:54

3 Answers3

44

I think Wikipedia page has formulas you need to write your own function...
I tried this:

public double Percentile(double[] sequence, double excelPercentile)
{
    Array.Sort(sequence);
    int N = sequence.Length;
    double n = (N - 1) * excelPercentile + 1;
    // Another method: double n = (N + 1) * excelPercentile;
    if (n == 1d) return sequence[0];
    else if (n == N) return sequence[N - 1];
    else
    {
         int k = (int)n;
         double d = n - k;
         return sequence[k - 1] + d * (sequence[k] - sequence[k - 1]);
    }
}

EDITED after CodeInChaos comment:
Excel uses a percentile value between 0 and 1 (so I changed my code to implement this with Wikipedia formulas) and the other method to calulate n (so I changed the commented one).

Marco
  • 56,740
  • 14
  • 129
  • 152
  • @picknick: I've just written the corresponding function from Wikipedia page. This should be the one used (according to Wikipedia) by Excel. – Marco Nov 15 '11 at 14:22
  • 2
    Two issues: 1) Excel seems to use the (N-1) method 2) It represents the percentile by a number between 0 and 1. – CodesInChaos Nov 15 '11 at 14:36
  • @CodeInChaos: thanks for your comment. I don't know how Excel works, I've just translated Wikipedia formulas in C#, assuming the part related to Excel was correct. What do you think? Are those wrong? I think OP needed _"a way"_ to calc percentile, even because there are many methods. Anyway your point of view matters for me. Let me know something. Thanks! :) – Marco Nov 15 '11 at 14:43
  • @CodeInChaos: I changed my code to reflect what you suggested after verified what Excel does. Thanks a lot :) – Marco Nov 15 '11 at 14:50
  • You misread wikipedia. The `(N+1)` formula is already part of the next section that talks about NIST Percentages. I guess the `/100` part is probably for consistency with the rest of the page. Also the `+1` part seems to be used because Excel arrays are 1 indexed, whereas C# uses 0 based arrays. – CodesInChaos Nov 15 '11 at 14:52
  • @CodeInChaos: I tried my code with Wikipedia formulas and Excel and both correspond... Wikipedia uses percentile P from 1 to 100, so /100 is for that I think. Am I wrong? ;) – Marco Nov 15 '11 at 14:55
  • Your code rounds the `percentile` parameter down to the nearest multiple of 0.01. Not sure if excel does such a thing, but find such behavior surprising. (For the same reason I dislike `TimeSpan.FromSeconds`) – CodesInChaos Nov 15 '11 at 15:00
  • @CodeInChaos: you're right, I checked Excel and this rounding does not occurs. Edited again... and thank you again!! :) – Marco Nov 15 '11 at 15:07
  • It's throwing me an ArgumentOutOfRangeException at the last line of the else block (`return sequence[k - 1] + d * (sequence[k] - sequence[k - 1]);`) – lost_in_the_source Nov 07 '13 at 01:13
  • 1
    @EdwardKarak: show us your sequence and percentile passed to the function, so we can try to help... – Marco Nov 07 '13 at 06:43
  • I solved the problem. I misread the statement "Excel uses a percentile value between 0 and 1", and I was entering numbers greater than 1. – lost_in_the_source Nov 07 '13 at 23:21
  • the compiler gives a warning due to the == comparison of doubles. In order to workaround I suggest replacing n==1 with n<=1, idem for n==N as follows: if (n <= 1) return sequence[0]; else if (n >= N) return sequence[N - 1]; – shelbypereira Dec 10 '16 at 12:41
27

Trying to reproduce the results at: http://www.techonthenet.com/excel/formulas/percentile.php I came up with:

public static double Percentile(IEnumerable<double> seq,double percentile)
{
    var elements=seq.ToArray();
    Array.Sort(elements);
    double realIndex=percentile*(elements.Length-1);
    int index=(int)realIndex;
    double frac=realIndex-index;
    if(index+1<elements.Length)
        return elements[index]*(1-frac)+elements[index+1]*frac;
    else
        return elements[index];
}

(Does not handle NaN and infinities).

A few test cases:

Percentile(new double[]{1,2,3,4}, 0.8).Dump();// 3.4
Percentile(new double[]{7,8,9,20}, 0.35).Dump();// 8.05
Percentile(new double[]{1,2,3,4}, 0.3).Dump();// 1.9
CodesInChaos
  • 106,488
  • 23
  • 218
  • 262
0

Add the values to a list, sort that list, and take the index value ceil(length of the list * percentile).

stracktracer
  • 1,862
  • 5
  • 24
  • 37