1

I have a requirement in Excel to spread small; i.e. pennies, monetry rounding errors fairly across the members of my club.

The error arises when I deduct money from members; e.g. £30 divided between 21 members is £1.428571... requiring £1.43 to be deducted from each member, totalling £30.03, in order to hit the £30 target.

The approach that I want to take, continuing the above example, is to deduct £1.42 from each member, totalling £29.82, and then deduct the remaining £0.18 using an error spreading technique to randomly take an extra penny from 18 of the 21 members.

This immediately made me think of Reservoir Sampling, and I used the information here: Random selection, to construct the test Excel spreadsheet here: https://www.dropbox.com/s/snbkldt6e8qkcco/ErrorSpreading.xls, on Dropbox, for you guys to play with...

The problem I have is that each row of this spreadsheet calculates the error distribution indepentently of every other row, and this causes some members to contribute more than their fair share of extra pennies.

What I am looking for is a modification to the Resevoir Sampling technique, or another balanced / 2 dimensional error spreading methodology that I'm not aware of, that will minimise the overall error between members across many 'error spreading' rows.

I think this is one of those challenging problems that has a huge number of other uses, so I'm hoping you geniuses have some good ideas!

Thanks for any insight you can share :)

Will

Community
  • 1
  • 1
William Bell
  • 182
  • 2
  • 10
  • If you don't want it to be random, then you can just cycle through your member list, looping back to the beginning when you reach the last member. It does mean you need to remember the last person selected for the extra penny, but that should be fairly straightforward. – Tim Williams Jun 07 '13 at 22:37
  • Hi, thanks for replying! I've considered this approach, but a random error distribution method fits better with the way the spreadsheet has been designed. – William Bell Jun 08 '13 at 14:17
  • I've tried modifying the Reservoir Sampling algorithm by weighting the random number based on the cumulative value of the errors previously distributed to a member. Having tried several method of calculating the weighting, I just seem to break the RS algorithm, causing fewer than the required number of pennies being distributed on that row. Since probabilities are used to calculate a horizontal error distribution, the same must be true in the vertical direction too. A way to combine these two probabilities would then be needed. I am struggling with both of these steps. – William Bell Jun 08 '13 at 14:51

1 Answers1

0

I found a solution. Not very elegant, through. You have to use two matrix. In the first you get completely random number, chosen with =RANDOM() and in the second you choose the n greater value Say that in F30 you have the first

=RANDOM()

cell. (I have experimented with your sheet.)

Just copy a column of n (in your sheet 8) in column A) In cell F52 you put:

=IF(RANK(F30,$F30:$Z30)<=$A52, 1, 0)

Until now, if you drag left and down the formulas, you have the same situation that is in your sheet (only less elegant und efficient).

But starting from the second row of random number you could compensate for the penny esbursed.

In cell F31 you put:

=RANDOM()-SUM(F$52:F52)*0.5

(pay attention to the $, each random number should have a correction basated on penny already spent.) If the $ are ok you should be OK dragging formulas left and down. You could also parametrize the 0.5 and experiment with other values. With 0,5 I have a error factor (the equivalent of your cell AB24) between 1 and 2

momobo
  • 1,755
  • 1
  • 14
  • 19
  • hi momobo, thanks for taking a look at this. I think your idea definitely has potential; I will test it out by updating the dropbox spreadsheet and post a further update on here tomorrow. Like you say, it does lack some of the elegance of the RS approach, so that can only mean 1 thing; this challenge is still OPEN! :) – William Bell Jun 08 '13 at 22:25
  • hi again momobo. I have implemented your idea in the spreadsheet here: https://www.dropbox.com/s/v01y59jq07ds0ac/ErrorSpreading%20-%20momobo%20idea.xls, but I'm unable to replicate your resulting error of between 1 and 2. In my spreadsheet, the 0.5 bias is too large and, after a few rows, is swamping the value generated by the rand(); casuing the rank() to lock onto a few columns and only select them. What number range is your random() function generating? – William Bell Jun 09 '13 at 12:26
  • I have now replicated the idea that momobo outlined in his post above via the spreadsheet here: https://www.dropbox.com/s/plvy4mu23lvktlm/ErrorSpreading%20-%20momobo%20idea%202.xls. This does work pretty well, (thanks momobo!), but I still wonder if any of you have any thoughts on a 2D Reservoir Sampling methodology? I won't mark this post as solved just yet... :) – William Bell Jun 09 '13 at 16:25
  • 1
    Ok, I'm going to accept the principle behind this answer - thanks momobo. I have made a couple of small changes though... Firstly, I'm using: =rand()+sum(...)*0.75. The '+' ensures a tables of +ve values, while the '0.75' pretty much guarantees an error factor of 1. The '+' in that formula then requires the rank() formula to have the usually optional order parameter; this should be set to 1. – William Bell Jun 11 '13 at 20:58