
Let's say I have an array of floating point numbers, in sorted (let's say ascending) order, whose sum is known to be an integer N. I want to "round" these numbers to integers while leaving their sum unchanged. In other words, I'm looking for an algorithm that converts the array of floating-point numbers (call it fn) to an array of integers (call it in) such that:

  1. the two arrays have the same length
  2. the sum of the array of integers is N
  3. the difference between each floating-point number fn[i] and its corresponding integer in[i] is less than 1 (or equal to 1 if you really must)
  4. given that the floats are in sorted order (fn[i] <= fn[i+1]), the integers will also be in sorted order (in[i] <= in[i+1])

Given that those four conditions are satisfied, an algorithm that minimizes the rounding variance (sum((in[i] - fn[i])^2)) is preferable, but it's not a big deal.
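For concreteness, the four conditions can be written as a quick Python checker (a sketch of my own, not the test script linked below; `satisfies` is a made-up name, and it uses the strict `< 1` reading of condition 3):

```python
def satisfies(fn, ints, N):
    """Check conditions 1-4 for a candidate rounding of fn."""
    return (
        len(ints) == len(fn)                                # 1. same length
        and sum(ints) == N                                  # 2. sum preserved
        and all(abs(i - f) < 1 for f, i in zip(fn, ints))   # 3. each off by < 1
        and all(a <= b for a, b in zip(ints, ints[1:]))     # 4. still sorted
    )
```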

Examples:

[0.02, 0.03, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.11, 0.12, 0.13, 0.14]
    => [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
[0.1, 0.3, 0.4, 0.4, 0.8]
    => [0, 0, 0, 1, 1]
[0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
    => [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
[0.4, 0.4, 0.4, 0.4, 9.2, 9.2]
    => [0, 0, 1, 1, 9, 9] is preferable
    => [0, 0, 0, 0, 10, 10] is acceptable
[0.5, 0.5, 11]
    => [0, 1, 11] is fine
    => [0, 0, 12] is technically not allowed but I'd take it in a pinch

To answer some excellent questions raised in the comments:

  • Repeated elements are allowed in both arrays (although I would also be interested to hear about algorithms that work only if the array of floats does not include repeats)
  • There is no single correct answer - for a given input array of floats, there are generally multiple arrays of ints that satisfy the four conditions.
  • The application I had in mind was - and this is kind of odd - distributing points to the top finishers in a game of MarioKart ;-) Never actually played the game myself, but while watching someone else I noticed that there were 24 points distributed among the top 4 finishers, and I wondered how it might be possible to distribute the points according to finishing time (so if someone finishes with a large lead they get a larger share of the points). The game tracks point totals as integers, hence the need for this kind of rounding.

For the curious, here is the test script I used to identify which algorithms worked.

David Z
  • What happens if you have an array of 1000 .001's? How do you want it to behave? Are repeats allowed? – ojblass Apr 27 '09 at 07:00
  • @ojblass: in that case, you would round 999 of them down to 0 and then round the last one up to 1. That would satisfy the requirement. – Jason Coco Apr 27 '09 at 07:04
  • In your first example: Why would 0.14 round to 1? – Brian Rasmussen Apr 27 '09 at 07:06
  • @Brian: to satisfy his requirement. Round all numbers naturally, then find the difference needed to make the sum and round in that direction. – Jason Coco Apr 27 '09 at 07:08
  • I have seen this needed in applications (estimation software) where everything is rounded off to dollars and bottom line numbers have to match. – ojblass Apr 27 '09 at 07:45
  • Here is a related question: http://stackoverflow.com/questions/3611015/find-the-highest-number-in-a-set-to-be-rounded-down-and-round-it-up-instead – Mnebuerquo Oct 01 '10 at 18:27
  • Also another related question: http://stackoverflow.com/questions/13483430/how-to-make-rounded-percentages-add-up-to-100 – CMCDragonkai May 20 '16 at 04:46
  • Related: https://stackoverflow.com/q/15769948/781723, https://stackoverflow.com/q/16226991/781723, https://stackoverflow.com/q/35931885/781723, https://stackoverflow.com/q/32544646/781723, https://stackoverflow.com/q/13483430/781723 – D.W. Feb 05 '23 at 21:16
  • Related: [Batch rounding with preservation of a sum](https://cs.stackexchange.com/q/151051/91753). – burnabyRails Feb 05 '23 at 21:47

13 Answers


One option you could try is "cascade rounding".

For this algorithm you keep track of two running totals: one of the floating point numbers so far, and one of the integers so far. To get the next integer you add the next fp number to your float running total, round that running total, then subtract the integer running total from the rounded result:

number  running total   rounded total   integer   integer running total
   1.3       1.3               1           1              1
   1.7       3.0               3           2              3
   1.9       4.9               5           2              5
   2.2       7.1               7           2              7
   2.8       9.9              10           3             10
   3.1      13.0              13           3             13
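In Python the scheme in the table might look like this (my own sketch, not part of the original answer; note that Python 3's `round` uses banker's rounding at exact halves):

```python
def cascade_round(fn):
    """Round a list of floats so the total is preserved by
    rounding a running total instead of the individual values."""
    ints = []
    float_total = 0.0   # running total of the floats
    int_total = 0       # running total of the integers emitted so far
    for x in fn:
        float_total += x
        rounded = round(float_total)       # round the float running total
        ints.append(rounded - int_total)   # emit the difference
        int_total = rounded
    return ints
```

As the comments below point out, the output is not guaranteed to be sorted even when the input is.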
dino
James Anderson
  • I think this would have more accurate rounding than mine. – ojblass Apr 27 '09 at 07:35
  • I think this one invalidates the "If fn is sorted -> in is sorted" rule without an extra sort step. [0.4, 0.2, 0.4, 0.4, 0.2, 0.4, ...] will round to [0, 1, 0, 0, 1, 0, ... ]. – Mikko Rantanen Apr 27 '09 at 07:43
  • @mikko: is [0.4, 0.2, 0.4, 0.4, 0.2, 0.4, ...] really sorted? – artificialidiot Apr 27 '09 at 10:27
  • Er. Good point. I mixed two counter examples. The original was [ 0.3, 0.3, 0.3, 0.3, ... ] which will become something like [0, 1, 0, 0, ...]. The [0.4, 0.2, 0.4, 0.2, ..] example should have been [0.4, 1.2, 2.4] which demonstrates less than optimal roundoff error, as the first 0.4 rounds off to 0 (error 0.4), 1.2 to 2 (error sum 1.2) and 2.4 again to 2 (error sum 2.8), while the optimal would round 0.4 or 2.4 up (error 0.6), the remaining .4 down (error sum 1.0) and 1.2 down (error sum 1.2). – Mikko Rantanen Apr 27 '09 at 14:43
  • Mikko: good point about the sorted rule. Also be aware that this looks really odd when there are lots of values close to 0 and 1 . – James Anderson Apr 28 '09 at 07:21
  • I've been testing out the solutions and it looks like if you drop the sorting requirement, this works, although it's still not optimal. – David Z Oct 21 '11 at 04:24
  • For anyone else, here is a javascript fiddle implementation of this algorithm: https://jsfiddle.net/cd8xqy6e/ – Jake Mar 31 '17 at 16:15

Here is one algorithm which should accomplish the task. The main difference from the other algorithms is that this one always rounds the numbers in the correct order, minimizing the roundoff error.

The language is some pseudo language, probably derived from JavaScript or Lua. It should explain the point. Note the one-based indexing (which is nicer with x to y for loops. :p)

// Temp array with same length as fn.
tempArr = Array(fn.length)

// Calculate the expected sum.
arraySum = sum(fn)

lowerSum = 0
// Populate temp array.
for i = 1 to fn.length
    tempArr[i] = { result: floor(fn[i]),              // Lower bound
                   difference: fn[i] - floor(fn[i]),  // Roundoff error
                   index: i }                         // Original index

    // Calculate the lower sum
    lowerSum = lowerSum + tempArr[i].result
end for

// Sort the temp array on the roundoff error
sort(tempArr, "difference")

// Now arraySum - lowerSum gives us the difference between the sums of these
// arrays. tempArr is ordered in such a way that the numbers closest to the
// next integer are at the end.
difference = arraySum - lowerSum

// Add 1 to those most likely to round up to the next number so that
// the difference is nullified.
for i = (tempArr.length - difference + 1) to tempArr.length
    tempArr[i].result = tempArr[i].result + 1
end for

// Finally, sort the array back on the original index.
sort(tempArr, "index")
Bergi
Mikko Rantanen

One really easy way is to take all the fractional parts and sum them up. That number, by the definition of your problem, must be a whole number. Distribute that whole number starting with the largest of your numbers: give one to the largest, then one to the second largest... etc. until you run out of things to distribute.

Note this is pseudocode... and may be off by one in an index... it's late and I am sleepy.

float accumulator = 0;

for (i = 0; i < num_elements; i++)  /* assumes 0 based array */
{
   accumulator += (fn[i] - floor(fn[i]));
   fn[i] = floor(fn[i]);
}

i = num_elements;

while ((accumulator > 0.5) && (i > 0))  /* 0.5 guards against float error */
{
    fn[i-1] += 1;   /* assumes 0 based array */
    accumulator -= 1;
    i--;
}

Update: There are other methods of distributing the accumulated values based on how much truncation was performed on each value. This would require keeping a separate list called loss[i] = fn[i] - floor(fn[i]). You can then iterate over the fn[i] list and repeatedly give 1 to the greatest loss item (setting its loss[i] to 0 afterwards). It's complicated but I guess it works.
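A sketch of that update in Python (my own code, not part of the answer; on ties the later index wins, which keeps a sorted input sorted):

```python
import math

def round_largest_loss(fn):
    ints = [math.floor(x) for x in fn]
    loss = [x - math.floor(x) for x in fn]  # truncation loss per element
    leftover = round(sum(loss))             # a whole number, since sum(fn) is
    for _ in range(leftover):
        # Give 1 to the element that lost the most in truncation;
        # among equal losses, prefer the later index to preserve order.
        i = max(range(len(fn)), key=lambda j: (loss[j], j))
        ints[i] += 1
        loss[i] = 0
    return ints
```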

ojblass
  • ... and skipping those numbers who, when incremented would become unsorted. – MSalters Apr 27 '09 at 07:12
  • hmmm i think by going from the largest down you would by definition keep the sort order... am I missing something? example? – ojblass Apr 27 '09 at 07:16
  • If you are starting with the largest, second largest, etc - then how would they become unsorted? – Marc Gravell Apr 27 '09 at 07:21
  • Nice answer - I like this ;-p – Marc Gravell Apr 27 '09 at 07:21
  • What about [ 0.4, 0.4, 0.4, 0.4, 9.2, 9.2 ]? I believe the algorithm should provide an answer of [ 0, 0, 1, 1, 9, 9 ] here. – Mikko Rantanen Apr 27 '09 at 07:23
  • I think my algorithm would make it [0, 0, 0, 0, 10, 10] – ojblass Apr 27 '09 at 07:26
  • @ojblass - seconded (just had to think for a second ;-p) - which is a perfectly legal answer to the problem. – Marc Gravell Apr 27 '09 at 07:28
  • I think it somewhat satisfies the conditions but no telling until clarification is provided. – ojblass Apr 27 '09 at 07:28
  • Actually I think even the reference [1.3, 1.7, 1.9, 2.2, 2.8, 3.1] => [1, 2, 2, 2, 3, 3] fails? First pass you get fn[i] of [1, 1, 1, 2, 2, 3] as you were taking a floor of the numbers. Next you'll increment the largest numbers giving you [1, 1, 1, 3, 3, 4]. – Mikko Rantanen Apr 27 '09 at 07:30
  • @Mikko - the [1,2,2,3,3] is just *a* feasible solution. There are often many feasible solutions to such problems. [1,1,1,3,3,4] is just as feasible. – Marc Gravell Apr 27 '09 at 07:32
  • [1, 1, 1, 3, 3, 4] rounds 1.9 to 1 and 2.2 to 3. I believe the idea was to round as accurately as possible. – Mikko Rantanen Apr 27 '09 at 07:32
  • The problem is that there are multiple rounding strategies yielding different answers... – ojblass Apr 27 '09 at 07:33
  • I am beginning to like James's solution – ojblass Apr 27 '09 at 07:34
  • Well, I'd say the references are part of the question. *Goes add a TDD tag* ;) But my point is more or less: As there is a way to minimize the roundoff error, why not do so? – Mikko Rantanen Apr 27 '09 at 07:36
  • And and and! (Yes.. I admit I just can't stop..) Given [ 0.5, 0.5, 11 ] this will be rounded to [ 0, 0, 12 ] in which case 12-11 is 1 while the original question required that the difference is less than 1. And I hope you're not taking this personally. I'm just so very surprised at the amount of answers which just round the highest numbers, all of which have the same problems. – Mikko Rantanen Apr 27 '09 at 08:31
  • I've just been testing out these solutions and when I take into account the "Update" paragraph, this actually seems to find the optimal solution in all my test cases. It's probably about equivalent to Mikko's answer. (Unfortunately I can't accept both.) – David Z Oct 21 '11 at 04:26

How about:

a) start: array is [0.1, 0.2, 0.4, 0.5, 0.8], N=2, presuming it's sorted
b) round them all the usual way: array is [0 0 0 1 1]
c) get the sum of the new array and subtract it from N to get the remainder.
d) while remainder>0, iterate through elements, going from the last one
   - check if the new value would break rule 3.
   - if not, add 1
e) in case that remainder<0, iterate from first one to the last one
   - check if the new value would break rule 3.
   - if not, subtract 1
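In Python, steps b-e might look like this (my own sketch; the rule-3 checks use the strict `< 1` reading, and Python 3's `round` differs from schoolbook rounding at exact halves):

```python
def round_and_fix(fn, N):
    ints = [round(x) for x in fn]            # b) round the usual way
    remainder = N - sum(ints)                # c) what is left to distribute
    i = len(ints) - 1
    while remainder > 0 and i >= 0:          # d) bump values from the end
        if ints[i] + 1 - fn[i] < 1:          #    rule 3: stay within 1
            ints[i] += 1
            remainder -= 1
        i -= 1
    i = 0
    while remainder < 0 and i < len(ints):   # e) shrink values from the front
        if fn[i] - (ints[i] - 1) < 1:
            ints[i] -= 1
            remainder += 1
        i += 1
    return ints
```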
vgru
  • One could determine the top n values (where n is the difference between the summed floats and the summed integers); this would remove the need to sort the array. Using a heap, getting the top n values can be made an O(n) operation. – Georg Schölly Apr 27 '09 at 07:36

Essentially what you'd do is distribute the leftovers after rounding to the most likely candidates.

  1. Round the floats as you normally would, but keep track of the delta from rounding and associated index into fn and in.
  2. Sort the second array by delta.
  3. While sum(in) < N, work forwards from the largest negative delta, incrementing the rounded value (making sure you still satisfy rule #3).
  4. Or, while sum(in) > N, work backwards from the largest positive delta, decrementing the rounded value (making sure you still satisfy rule #3).

Example:

[0.02, 0.03, 0.05, 0.06, 0.07, 0.08, 0.09, 0.1, 0.11, 0.12, 0.13, 0.14] N=1

1. [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] sum=0
and [[-0.02, 0], [-0.03, 1], [-0.05, 2], [-0.06, 3], [-0.07, 4], [-0.08, 5], 
     [-0.09, 6], [-0.1, 7], [-0.11, 8], [-0.12, 9], [-0.13, 10], [-0.14, 11]]

2. sorting will reverse the array

3. working from the largest negative remainder, you get [-0.14, 11].
Increment `in[11]` and you get [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1] sum=1 
Done.
lc.
  • And buy a flux capacitor while you are at it. – ojblass Apr 27 '09 at 07:14
  • Yeah, definitely agreed it's a bit of a mess, but it ought to work. – lc. Apr 27 '09 at 07:18
  • This should work. Though you could make it a bit clearer by always rounding to one direction (such as down) and then working from the largest abs(delta) and compensating. – Mikko Rantanen Apr 27 '09 at 07:47
  • Sorting the array by the fractional part will be an extra O(NlgN) step, but it will allow one to minimize the worst-case rounding error. For some data sets it may still be arbitrarily close to one (e.g. if there are a million values, all of which end in .999999, one will have to be rounded down) but sorting by delta will allow one to achieve the minimum absolute rounding error that is achievable with any given data set. – supercat Sep 17 '13 at 15:12

Can you try something like this?

in [i] = fn [i] - int (fn [i]);
fn_res [i] = fn [i] - in [i];

fn_res → is the resultant fraction. (I thought this was basic...) Are we missing something?

Alphaneo
  • Let's substitute `in [i]` with its value: `fn_res [i] = fn [i] - ( fn [i] - int (fn [i]) ) = int (fn [i]) `. So, it seems that something was missing ;) – ruvim Mar 26 '15 at 19:36

Well, condition 4 is the pain point. Otherwise you could do things like "usually round down and accumulate the leftover; round up when the accumulator >= 1". (edit: actually, that might still be OK as long as you swapped their position?)

There might be a way to do it with linear programming? (that's maths "programming", not computer programming - you'd need some maths to find the feasible solution, although you could probably skip the usual "optimisation" part).

As an example of the linear programming - with the example [1.3, 1.7, 1.9, 2.2, 2.8, 3.1] you could have the rules:

1 <= i < 2
1 <= j < 2
1 <= k < 2
2 <= l < 3
2 <= m < 3
3 <= n < 4
i <= j <= k <= l <= m <= n
i + j + k + l + m + n = 13

Then apply some linear/matrix algebra ;-p Hint: there are products to do the above based on things like the "Simplex" algorithm. Common university fodder, too (I wrote one at uni for my final project).
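Since the variables here must be integers, this is really integer programming, and a proper solver would be needed; for an instance this small, though, the constraint system can simply be checked by brute force (a sketch of my own, not Marc's code):

```python
import itertools
import math

fn = [1.3, 1.7, 1.9, 2.2, 2.8, 3.1]
N = 13

# Each integer must lie in {floor(x), floor(x) + 1} to stay within 1 of x.
ranges = [range(math.floor(x), math.floor(x) + 2) for x in fn]

# Enumerate all combinations and keep the feasible ones.
solutions = [
    c for c in itertools.product(*ranges)
    if sum(c) == N and all(a <= b for a, b in zip(c, c[1:]))
]
```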

Marc Gravell

The problem, as I see it, is that the sorting algorithm is not specified. Or more like - whether it's a stable sort or not.

Consider the following array of floats:

[ 0.2 0.2 0.2 0.2 0.2 ]

The sum is 1. The integer array then should be:

[ 0 0 0 0 1 ]

However, if the sorting algorithm isn't stable, it could sort the "1" somewhere else in the array...

Vilx-
  • In that sense it should be a stable sort, then. [0 0 0 0 1] would be a desired result. The intent of rule 4 is to say that [0 0 0 1 0] would be unacceptable. (Good point though) – David Z Apr 27 '09 at 15:27

Below is a Python and NumPy implementation of @mikko-rantanen's code. It took me a bit to put this together, so this may be helpful to future Googlers despite the age of the topic.

import numpy as np
from math import floor

original_array = np.array([1.2, 1.5, 1.4, 1.3, 1.7, 1.9])

# Calculate expected sum of the original values (must be an integer)
expected_sum = np.sum(original_array)

# Collect values for temporary array population
array_list = []
lower_sum = 0
for i, j in enumerate(np.nditer(original_array)):
    array_list.append([i, floor(j), j - floor(j)])  # Original index, lower bound, roundoff error
    # Calculate the lower sum of values
    lower_sum += floor(j)

# Populate temporary array
temp_array = np.array(array_list)

# Sort temporary array based on roundoff error
temp_array = temp_array[temp_array[:, 2].argsort()]

# Calculate difference between the expected sum and the lower sum
# This is the number of integers that need to be rounded up from the lower sum
# The sort order (roundoff error) ensures that the values closest to being
# rounded up are at the bottom of the array
difference = int(expected_sum - lower_sum)

# Add one to the numbers most likely to round up to eliminate the difference
temp_array_len, _ = temp_array.shape
for i in range(temp_array_len - difference, temp_array_len):
    temp_array[i, 1] += 1

# Re-sort the array based on the original index
temp_array = temp_array[temp_array[:, 0].argsort()]

# Return array to the one-dimensional format of the original array
array_list = []
for i in range(temp_array_len):
    array_list.append(int(temp_array[i, 1]))
new_array = np.array(array_list)
ninetynine

Calculate the sum of the floors and the sum of the numbers. Round the sum of the numbers and subtract the sum of the floors from it; the difference is how many ceilings we need to patch (how many +1s we need). Sort the array by the difference between each number's ceiling and the number itself, from small to large.

For diff iterations (diff is how many ceilings we need to patch), set the result to the ceiling of the number. For the others, set the result to the floor of the number.

import java.util.Arrays;

public class Float_Ceil_or_Floor {

    public static int[] getNearlyArrayWithSameSum(double[] numbers) {
        NumWithDiff[] numWithDiffs = new NumWithDiff[numbers.length];
        double sum = 0.0;
        int floorSum = 0;
        for (int i = 0; i < numbers.length; i++) {
            int floor = (int) numbers[i];
            int ceil = floor;
            if (floor < numbers[i]) ceil++; // a number like 4.0 has the same floor and ceiling
            floorSum += floor;
            sum += numbers[i];
            numWithDiffs[i] = new NumWithDiff(ceil, floor, ceil - numbers[i]);
        }

        // sort array by its diffWithCeil
        Arrays.sort(numWithDiffs, (a, b) -> Double.compare(a.diffWithCeil, b.diffWithCeil));

        int roundSum = (int) Math.round(sum);
        int diff = roundSum - floorSum;
        int[] res = new int[numbers.length];

        for (int i = 0; i < numWithDiffs.length; i++) {
            if (diff > 0 && numWithDiffs[i].floor != numWithDiffs[i].ceil) {
                res[i] = numWithDiffs[i].ceil;
                diff--;
            } else {
                res[i] = numWithDiffs[i].floor;
            }
        }
        return res;
    }

    public static void main(String[] args) {
        double[] arr = { 1.2, 3.7, 100, 4.8 };
        int[] res = getNearlyArrayWithSameSum(arr);
        for (int i : res) System.out.print(i + " ");
    }
}

class NumWithDiff {
    int ceil;
    int floor;
    double diffWithCeil;

    public NumWithDiff(int c, int f, double d) {
        this.ceil = c;
        this.floor = f;
        this.diffWithCeil = d;
    }
}
Jerry Z.

Without minimizing the variance, here's a trivial one:

  1. Sort values from left to right.
  2. Round all down to the next integer.
  3. Let the sum of those integers be K. Increase the N-K rightmost values by 1.
  4. Restore original order.

This obviously satisfies your conditions 1.-4. Alternatively, you could round to the closest integer, and increase N-K of the ones you had rounded down. You can do this greedily by the difference between the original and rounded value, but each run of rounded-down values must only be increased from right to left, to maintain sorted order.
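A sketch of steps 1-4 in Python (my own code, not the answerer's; assumes the floats genuinely sum to an integer):

```python
import math

def round_preserving_sum(fn):
    # 1. Sort values, remembering original positions.
    order = sorted(range(len(fn)), key=lambda i: fn[i])
    # 2. Round all down to the next integer.
    ints = [math.floor(fn[i]) for i in order]
    # 3. Let K be their sum; increase the N - K rightmost values by 1.
    N = round(sum(fn))
    K = sum(ints)
    for j in range(len(ints) - (N - K), len(ints)):
        ints[j] += 1
    # 4. Restore the original order.
    result = [0] * len(fn)
    for pos, v in zip(order, ints):
        result[pos] = v
    return result
```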

user32849
  • This seems to be functionally the same algorithm that [ojblass already posted](https://stackoverflow.com/a/792473/56541). – David Z Feb 28 '18 at 20:24

If you can accept a small change in the total while improving the variance, this will probabilistically preserve totals in Python:

import math
import random
integer_list = [int(x) + int(random.random() <= math.modf(x)[0]) for x in my_list]

To explain: it rounds all numbers down and adds one with a probability equal to the fractional part, i.e. one in ten 0.1s will become 1 and the rest 0.

This works for statistical data where you are converting large numbers of fractional persons into either 1 person or 0 persons.

alexd

Keep the summed diffs under 1, and check that the result stays sorted. Something like,

int i = 0;
double res = 0;

while (i < sizeof(fn) / sizeof(float)) {
    res += fn[i] - floor(fn[i]);
    if (res >= 1) {
        res--;
        in[i] = ceil(fn[i]);
    }
    else
        in[i] = floor(fn[i]);
    if (i > 0 && in[i-1] > in[i])
        swap(in[i-1], in[i]);
    i++;
}

(It's paper code, so I didn't check the validity.)

Bruno Gelb