Distributing integers using weights? How to calculate?

Question

I need to distribute a value based on some weights. For example, if my weights are 1 and 2, then I would expect the column weighted 2 to have twice the value of the column weighted 1.

I have some Python code to demonstrate what I'm trying to do, and the problem:

def distribute(total, distribution):
    distributed_total = []
    for weight in distribution:
        weight = float(weight)
        p = weight/sum(distribution)
        weighted_value = round(p*total)
        distributed_total.append(weighted_value)
    return distributed_total

for x in xrange(100):
    d = distribute(x, (1,2,3))
    if x != sum(d):
        print x, sum(d), d

There are many cases shown by the code above where distributing a value results in the sum of the distribution being different than the original value. For example, distributing 3 with weights of (1,2,3) results in (1,1,2), which totals 4.

What is the simplest way to fix this distribution algorithm?

UPDATE:

I expect the distributed values to be integer values. It doesn't matter exactly how the integers are distributed as long as they total to the correct value, and they are "as close as possible" to the correct distribution.

(By correct distribution I mean the non-integer distribution, and I haven't fully defined what I mean by "as close as possible." There are perhaps several valid outputs, so long as they total the original value.)

So, what is the desired output for distributing 3 with weights (1,2,3)? — Avaris, Jan 31 '12 at 23:11
Do you want float or integer values as return? What is the expected value here? (1,1,1) or (0,1,2) ? — ElKamina, Jan 31 '12 at 23:18
The easiest way given your incomplete specifications: remove "round". If you need integer results there are no exact solutions in many cases. What kind of result do you want in those cases? — Patrick, Jan 31 '12 at 23:18
@Patrick: The amounts distributed must be integer numbers (of cents, apples, kingdoms, whatever) otherwise there's no problem. The main criterion is that each share should be sufficiently close to the "float" answer that no participant has grounds for complaint. — John Machin, Jan 31 '12 at 23:39
@JohnMachin Thank you, that's a good definition of what I was looking for. And +1 for pointing out that there is no problem if we're not using integral values. — Buttons840, Feb 01 '12 at 00:12
I think [this question](http://stackoverflow.com/questions/792460/how-to-round-floats-to-integers-while-preserving-their-sum) and [this one](http://stackoverflow.com/questions/8685308/allocate-items-according-to-an-approximate-ratio-in-python) might be relevant for your case. In particular it seems like the second of those linked questions is asking nearly the same thing. — David Z, Feb 01 '12 at 22:18

score 9 · Accepted Answer · answered Jan 31 '12 at 23:33

Distribute the first share as expected. Now you have a simpler problem, with one fewer participant and a reduced amount available for distribution. Repeat until there are no more participants.

>>> def distribute2(available, weights):
...     distributed_amounts = []
...     total_weights = sum(weights)
...     for weight in weights:
...         weight = float(weight)
...         p = weight / total_weights
...         distributed_amount = round(p * available)
...         distributed_amounts.append(distributed_amount)
...         total_weights -= weight
...         available -= distributed_amount
...     return distributed_amounts
...
>>> for x in xrange(100):
...     d = distribute2(x, (1,2,3))
...     if x != sum(d):
...         print x, sum(d), d
...
>>>
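For readers on Python 3, the same idea can be sketched as follows. This is an illustrative port rather than the answerer's code: the name `distribute_sequential` is made up here, and the sketch assumes non-negative integer weights with at least one positive weight. Python 3's `round` rounds halves to the nearest even integer, so individual shares can differ from what Python 2 would produce, although the total should still match.

```python
def distribute_sequential(available, weights):
    """Hand out `available` one participant at a time (hypothetical helper).

    Assumes non-negative integer weights, at least one of them positive.
    """
    amounts = []
    remaining_weight = sum(weights)
    for weight in weights:
        if remaining_weight == 0:
            # Only zero-weight participants are left; they get nothing.
            share = 0
        else:
            # Share of what is still available, rounded to an integer.
            share = round(available * weight / remaining_weight)
        amounts.append(share)
        # Shrink the problem: one participant and `share` units fewer.
        remaining_weight -= weight
        available -= share
    return amounts


# Same check as in the question, written for Python 3.
for total in range(100):
    shares = distribute_sequential(total, (1, 2, 3))
    assert sum(shares) == total, (total, shares)
```

The last participant with a positive weight always has `weight == remaining_weight`, so that participant receives everything still available, which is why the shares add up to the total.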

This solution is good because it doesn't require examining the values assigned to previous "buckets" in your for loop. It essentially does a +1 or -1 on the last bucket to ensure that the total is correct. — Buttons840, Feb 01 '12 at 00:11

score 2 · Answer 2 · answered Jan 31 '12 at 23:19

You have to distribute the rounding errors somehow:

Actual:
| |   |     |

Pixel grid:
|   |   |   |

The simplest approach is to round each true boundary to the nearest pixel, for both the start and the end position of a block. So, when you round the end of block A up from 0.5 to 1, you also move the start of block B from 0.5 to 1. This decreases the size of B by 0.5 (in essence, "stealing" the size from it). Of course, this leads to B stealing size from C, ultimately resulting in:

|   |   |   |

but how else did you expect to divide 3 into 3 integral parts?
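One way to read this in code is to round the cumulative boundaries and take the differences between neighbours. The sketch below is only an interpretation of the description above: `distribute_by_boundaries` is a hypothetical name, it assumes an integer total and non-negative integer weights with a positive sum, and it rounds halves up using integer arithmetic so that no floating-point error is involved.

```python
from itertools import accumulate


def distribute_by_boundaries(total, weights):
    """Round each block's end position to the grid, then take differences.

    Hypothetical helper; assumes integer inputs and sum(weights) > 0.
    """
    weight_sum = sum(weights)
    edges = [0]  # start position of the first block
    for cumulative in accumulate(weights):
        # Exact end position is cumulative * total / weight_sum.
        # floor(x + 1/2) computed with integers only (round half up).
        edges.append((2 * cumulative * total + weight_sum) // (2 * weight_sum))
    # Each block's size is the distance between its rounded edges.
    return [end - start for start, end in zip(edges, edges[1:])]


print(distribute_by_boundaries(3, (1, 2, 3)))  # [1, 1, 1]
```

The final edge always rounds to `total`, so the sizes sum to the original value. For weights (1, 2, 3) and a total of 3, the exact boundaries 0.5, 1.5 and 3 round to 1, 2 and 3, which gives three equal blocks as in the diagram.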

Greg Ra · Answer 3 · 2012-02-01T22:25:40.577

1

If you expect distributing 3 with weights of (1,2,3) to be equal to (0.5, 1, 1.5), then the rounding is your problem:

weighted_value = round(p*total)

You want:

weighted_value = p*total

EDIT: Solution to return integer distribution

from math import modf

def distribute(total, distribution):
  leftover = 0.0
  distributed_total = []
  distribution_sum = sum(distribution)
  for weight in distribution:
    weight = float(weight)
    # modf returns (fractional part, integer part); carry the fraction forward
    leftover, weighted_value = modf(weight*total/distribution_sum + leftover)
    distributed_total.append(weighted_value)
  distributed_total[-1] = round(distributed_total[-1]+leftover) #mitigate round off errors
  return distributed_total

edited Feb 01 '12 at 22:25

answered Jan 31 '12 at 23:13


I expected the distribution to contain only integer values. I didn't specify this in my original question, but it was implied in my code. – Buttons840 Jan 31 '12 at 23:32
updated the answer to contain a solution that returns integer distribution – Greg Ra Jan 31 '12 at 23:39
-1 It doesn't work. For example, `sum(distribute(19.0, 10*[1.0]))` produces `18.0`; it should be `19.0` – John Machin Feb 01 '12 at 08:09
ah yeah, the round off errors creep up... should be better now – Greg Ra Feb 01 '12 at 22:26

score 1 · Answer 4 · answered Jan 31 '12 at 23:16

The easiest approach is to calculate the normalization scale, which is the factor by which the sum of the weights exceeds the total you are aiming for, then divide each item in your weights by that scale.

def distribute(total, weights):
    scale = float(sum(weights))/total
    return [x/scale for x in weights]
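A quick check of what this returns, as a sketch: the function is repeated so the snippet runs on its own, and `shares` is just an illustrative name. The shares sum to the total, but they are floats rather than integers.

```python
def distribute(total, weights):
    # Same scaling as above: divide every weight by sum(weights) / total.
    # Note: total must be non-zero, otherwise the division below fails.
    scale = float(sum(weights)) / total
    return [x / scale for x in weights]


shares = distribute(3, (1, 2, 3))
print(shares)       # [0.5, 1.0, 1.5]
print(sum(shares))  # 3.0
```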

cheeken


PS - In case you are unfamiliar with it, that last line is using [list comprehension](http://docs.python.org/tutorial/datastructures.html#list-comprehensions), which is just a fancy way of putting a list-making `for` loop in one line. – cheeken Jan 31 '12 at 23:18
... and then your weights aren't integers anymore. Which was clearly wanted from the `round` call. – derobert Jan 31 '12 at 23:20
