6

I need to distribute a value based on some weights. For example, if my weights are 1 and 2, then I would expect the column weighted 2 to receive twice the value of the column weighted 1.

I have some Python code to demonstrate what I'm trying to do, and the problem:

def distribute(total, distribution):
    distributed_total = []
    for weight in distribution:
        weight = float(weight)
        p = weight/sum(distribution)
        weighted_value = round(p*total)
        distributed_total.append(weighted_value)
    return distributed_total

for x in xrange(100):
    d = distribute(x, (1,2,3))
    if x != sum(d):
        print x, sum(d), d

There are many cases shown by the code above where distributing a value results in the sum of the distribution being different than the original value. For example, distributing 3 with weights of (1,2,3) results in (1,1,2), which totals 4.

What is the simplest way to fix this distribution algorithm?

UPDATE:

I expect the distributed values to be integer values. It doesn't matter exactly how the integers are distributed as long as they total to the correct value, and they are "as close as possible" to the correct distribution.

(By correct distribution I mean the non-integer distribution, and I haven't fully defined what I mean by "as close as possible." There are perhaps several valid outputs, so long as they total the original value.)

Buttons840
  • So, what is the desired output for distributing 3 with weights (1,2,3)? – Avaris Jan 31 '12 at 23:11
  • Do you want float or integer values as return? What is the expected value here? (1,1,1) or (0,1,2) ? – ElKamina Jan 31 '12 at 23:18
  • The easiest way given your incomplete specifications: remove "round". If you need integer results there are no exact solutions in many cases. What kind of result do you want in those cases? – Patrick Jan 31 '12 at 23:18
  • 1
    @Patrick: The amounts distributed must be integer numbers (of cents, apples, kingdoms, whatever) otherwise there's no problem. The main criterion is that each share should be sufficiently close to the "float" answer that no participant has grounds for complaint. – John Machin Jan 31 '12 at 23:39
  • @JohnMachin Thank you, that's a good definition of what I was looking for. And +1 for pointing out that there is no problem if we're not using integral values. – Buttons840 Feb 01 '12 at 00:12
  • 1
    I think [this question](http://stackoverflow.com/questions/792460/how-to-round-floats-to-integers-while-preserving-their-sum) and [this one](http://stackoverflow.com/questions/8685308/allocate-items-according-to-an-approximate-ratio-in-python) might be relevant for your case. In particular it seems like the second of those linked questions is asking nearly the same thing. – David Z Feb 01 '12 at 22:18

4 Answers

9

Distribute the first share as expected. Now you have a simpler problem, with one fewer participant and a reduced amount available for distribution. Repeat until there are no more participants.

>>> def distribute2(available, weights):
...     distributed_amounts = []
...     total_weights = sum(weights)
...     for weight in weights:
...         weight = float(weight)
...         p = weight / total_weights
...         distributed_amount = round(p * available)
...         distributed_amounts.append(distributed_amount)
...         total_weights -= weight
...         available -= distributed_amount
...     return distributed_amounts
...
>>> for x in xrange(100):
...     d = distribute2(x, (1,2,3))
...     if x != sum(d):
...         print x, sum(d), d
...
>>>
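For readers on Python 3, here is a minimal sketch of the same idea. It is an adaptation, not the code above verbatim: the name `distribute_sequential` and the guard for a zero remaining weight are assumptions added for illustration. Note that Python 3's `round` rounds halves to the nearest even integer, so individual shares can differ from the Python 2 session above even though the totals still match.

```python
def distribute_sequential(available, weights):
    """Give each weight its share of what is still available, in order.

    Hypothetical Python 3 adaptation of the approach described above.
    """
    remaining_weight = sum(weights)
    shares = []
    for weight in weights:
        if remaining_weight:
            # Share of the *remaining* amount, relative to the *remaining* weight.
            share = round(available * weight / remaining_weight)
        else:
            # Assumption: nothing is handed out once no weight is left.
            share = 0
        shares.append(share)
        remaining_weight -= weight
        available -= share
    return shares


# Same check as in the question: the shares always add up to the total.
for total in range(100):
    assert sum(distribute_sequential(total, (1, 2, 3))) == total

print(distribute_sequential(3, (1, 2, 3)))  # [0, 1, 2] on Python 3
```

The last participant's ratio is always `weight / weight`, so they receive exactly what is left, which is why the sum comes out right. If every weight is zero there is nobody to give the amount to, and this sketch returns all zeros.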
John Machin
2

You have to distribute the rounding errors somehow:

Actual:
| |   |     |

Pixel grid:
|   |   |   |

The simplest approach is to round each true boundary to the nearest pixel, for both the start and the end position of every block. So when you round the end of block A up from 0.5 to 1, you also move the start of block B from 0.5 to 1. This decreases the size of B by 0.5 (in essence, "stealing" size from it). Of course, B then steals size from C, ultimately resulting in:

|   |   |   |

but how else did you expect to divide 3 into 3 integral parts?
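The boundary-rounding idea can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name `distribute_cumulative` is made up, and halves are rounded up explicitly with `floor(x + 0.5)` so that 0.5 becomes 1 as in the description (Python 3's built-in `round` would round 0.5 down to 0).

```python
import math


def distribute_cumulative(total, weights):
    """Round each cumulative boundary, then take differences between boundaries.

    Hypothetical sketch; assumes non-negative weights with a positive sum.
    """
    weight_sum = sum(weights)
    shares = []
    running_weight = 0
    previous_edge = 0
    for weight in weights:
        running_weight += weight
        # True (fractional) end position of this block, rounded half up.
        edge = math.floor(total * running_weight / weight_sum + 0.5)
        shares.append(edge - previous_edge)
        previous_edge = edge
    return shares


print(distribute_cumulative(3, (1, 2, 3)))  # [1, 1, 1]
```

Because the final boundary is the total itself, the shares always sum to the total, and no boundary ends up more than half a unit away from its true position.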

derobert
1

If you expect distributing 3 with weights of (1,2,3) to be equal to (0.5, 1, 1.5), then the rounding is your problem:

weighted_value = round(p*total)

You want:

weighted_value = p*total

EDIT: Solution to return integer distribution

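from math import modf

def distribute(total, distribution):
  leftover = 0.0  # fractional part carried over from the previous share
  distributed_total = []
  distribution_sum = sum(distribution)
  for weight in distribution:
    weight = float(weight)
    # modf returns (fractional part, integer part); keep the integer part
    # as this share and carry the fraction forward to the next one
    leftover, weighted_value = modf(weight*total/distribution_sum + leftover)
    distributed_total.append(int(weighted_value))
  distributed_total[-1] = int(round(distributed_total[-1]+leftover)) #mitigate round off errors
  return distributed_total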
Greg Ra
1

The easiest approach is to calculate the normalization scale, which is the factor by which the sum of the weights exceeds the total you are aiming for, then divide each item in your weights by that scale.

def distribute(total, weights):
    scale = float(sum(weights))/total
    return [x/scale for x in weights]
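As a quick illustration of what this scaling produces, here is an equivalent form that multiplies by the total instead of dividing by a scale. The name `distribute_exact` is hypothetical; the rewrite is offered only because, as far as I can tell, the version above raises `ZeroDivisionError` when `total` is 0.

```python
def distribute_exact(total, weights):
    """Scale the weights so they sum to total; results are floats, not integers."""
    weight_sum = float(sum(weights))
    return [w * total / weight_sum for w in weights]


print(distribute_exact(3, (1, 2, 3)))  # [0.5, 1.0, 1.5]
print(distribute_exact(0, (1, 2, 3)))  # [0.0, 0.0, 0.0]
```

The shares are exactly proportional and sum to the total, but they are generally not whole numbers, so this only helps when fractional amounts are acceptable.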
cheeken
  • PS - In case you are unfamiliar with it, that last line is using [list comprehension](http://docs.python.org/tutorial/datastructures.html#list-comprehensions), which is just a fancy way of putting a list-making `for` loop in one line. – cheeken Jan 31 '12 at 23:18
  • 1
    ... and then your weights aren't integers anymore. Which was clearly wanted from the `round` call. – derobert Jan 31 '12 at 23:20