-1

My data set is y. I have an ID and Sales column. I would like to ad a 3 column that has the percentile of each employee based on their sales.

The formula for the percentile is:

Percentile Employee(i) = (Number of employees with less sales)/(Total employees-1)

Thanks

Braiam
  • 1
  • 11
  • 47
  • 78
user1783504
  • 331
  • 4
  • 7
  • 14
  • 1
    Please read [this](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) and update your question accordingly. Please also show us what you have tried. – Henrik Sep 14 '13 at 19:16

2 Answers2

3

Using your formula, consider the following fake data solution:

#fake data
y <- data.frame(
            #20 fake ids
            id = seq(1,20),
            #20 fake sales between 10000 and 15000  
            sales = runif(20, 10000, 15000))

#define an employee count
emp_cnt <- length(y$id)
#rank your sales
y$rank <- rank(y$sales,ties.method="min")
#subtract each rank from one (i.e. lowest rank) and divide by one minus emp_cnt
y$percentile <- (y$rank - 1)/(emp_cnt - 1)
joemienko
  • 2,220
  • 18
  • 27
0

Use this:

within(y[order(y$sales), ], p <- with(rle(sales), rep(c(0, head(cumsum(lengths), -1)), lengths))/(length(ID)-1))

Example output:

   ID sales         p
4   4     3 0.0000000
6   6     3 0.0000000
11 11     3 0.0000000
19 19     3 0.0000000
20 20     3 0.0000000
3   3     4 0.2631579
13 13     4 0.2631579
17 17     4 0.2631579
18 18     4 0.2631579
2   2     5 0.4736842
8   8     5 0.4736842
10 10     5 0.4736842
12 12     5 0.4736842
16 16     5 0.4736842
9   9     6 0.7368421
5   5     7 0.7894737
7   7     7 0.7894737
15 15     7 0.7894737
1   1     8 0.9473684
14 14     8 0.9473684

Data used:

   ID sales
1   1     8
2   2     5
3   3     4
4   4     3
5   5     7
6   6     3
7   7     7
8   8     5
9   9     6
10 10     5
11 11     3
12 12     5
13 13     4
14 14     8
15 15     7
16 16     5
17 17     4
18 18     4
19 19     3
20 20     3
Ferdinand.kraft
  • 12,579
  • 10
  • 47
  • 69