7

I know how to round a number in Python; this is not a simple technical question.

My issue is that rounding can make a set of percentages fail to add up to 100% when, mathematically, they should.

For example:

a = 1
b = 14

I want to compute the percentage of a in (a + b) and b in (a + b).

The answer should be

a/(a + b) = 1/15 
b/(a + b) = 14/15

When I try to round those numbers, I get

1/15 = 6.66 
14/15 = 93.33 

(I was flooring), so the two numbers don't add up to 100%; they sum to 99.99%.

In this case, we should take the ceiling of 1/15, which gives 6.67, and the floor of 14/15, which gives 93.33. Now they add up to 100%. The rule in this case would be "round to the nearest number".

However, if we have a more complicated case, say 3 numbers:

a = 1
b = 7
c = 7

flooring:

1/15 = 6.66
7/15 = 46.66
7/15 = 46.66

These sum to 99.98%, not 100%.

ceiling:

1/15 = 6.67
7/15 = 46.67
7/15 = 46.67

These sum to 100.01%, not 100%.

Rounding to the nearest number gives the same result as ceiling here, so it still doesn't add up to 100%.
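The three cases above can be reproduced with a short sketch. It works in integer hundredths of a percent (so 667 stands for 6.67%) to keep floating-point noise out of the comparison; the variable names are only illustrative.

```python
import math
from fractions import Fraction

values = [1, 7, 7]
total = sum(values)

# Exact shares in hundredths of a percent, e.g. 2000/3 is about 666.67 (6.6667%).
exact = [Fraction(v * 10000, total) for v in values]

floored = [math.floor(x) for x in exact]
ceiled = [math.ceil(x) for x in exact]
nearest = [round(x) for x in exact]

for name, parts in [("floor", floored), ("ceiling", ceiled), ("nearest", nearest)]:
    shown = ", ".join(f"{p / 100:.2f}" for p in parts)
    print(f"{name}: {shown} -> total {sum(parts) / 100:.2f}")
```

None of the three rules, applied independently to each value, lands on exactly 10000 units, which is why some values have to be treated differently from others.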

So my question is: what should I do to make sure they add up to 100% in every case?

Thanks in advance.

UPDATE: Thanks for the tips in the comments. I took the "Largest Remainder" solution from the answer to the duplicate post.

The code is:

def round_to_100_percent(number_set, digit_after_decimal=2):
    """
    Take a list of numbers and return a list of percentages, each giving
    the share of that number in the sum of all numbers.
    The returned percentages are adjusted so that they add up to 100%.
    Note: the algorithm used here is 'Largest Remainder'.
    The downside is that the results aren't exactly accurate, but rounded values never are anyway :)
    """
    total = float(sum(number_set))
    scale = 10 ** digit_after_decimal
    # Percentages scaled up so that the digits to keep form the integer part.
    unround_numbers = [x / total * 100 * scale for x in number_set]
    # (index, fractional part) pairs, largest fractional part first.
    decimal_part_with_index = sorted(
        [(index, value % 1) for index, value in enumerate(unround_numbers)],
        key=lambda y: y[1], reverse=True)
    # Units lost by flooring every value; hand them out one at a time.
    remainder = 100 * scale - sum(int(x) for x in unround_numbers)
    index = 0
    while remainder > 0:
        unround_numbers[decimal_part_with_index[index][0]] += 1
        remainder -= 1
        index = (index + 1) % len(number_set)
    return [int(x) / float(scale) for x in unround_numbers]

Tested; it seems to work fine.
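For comparison, here is a minimal sketch of the same Largest Remainder idea done entirely in integer arithmetic, so that the ranking of remainders does not depend on floating-point fractional parts. It assumes the inputs are non-negative integers; the function name `largest_remainder_percentages` is hypothetical, and it returns integer units (hundredths of a percent for `digits=2`) rather than floats.

```python
def largest_remainder_percentages(counts, digits=2):
    """Return shares in units of 10**-digits percent that sum to exactly 100%."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must have a positive sum")
    target = 100 * 10 ** digits  # e.g. 10000 units for two decimals
    # divmod yields the floored share and the exact integer remainder.
    floors, remainders = zip(*(divmod(c * target, total) for c in counts))
    units = list(floors)
    leftover = target - sum(units)
    # Give one extra unit to the entries with the largest remainders.
    order = sorted(range(len(counts)), key=lambda i: remainders[i], reverse=True)
    for i in order[:leftover]:
        units[i] += 1
    return units


units = largest_remainder_percentages([1, 7, 7])
print([f"{u / 100:.2f}%" for u in units])
```

For `[1, 7, 7]` this gives 667, 4667 and 4666 units. Ties between equal remainders are broken by position here, which is an arbitrary choice; any tie-breaking rule keeps the total at 100%.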

Jerry Meng
  • Floating point numbers are never exact (neither in decimal, nor in binary). If you really need that kind of accuracy, store them as actual fractions. See also [What Every Computer Scientist Should Know About Floating-Point Arithmetic](http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html) – hlt Aug 12 '14 at 18:22
  • Don't make sure they add up to 100%. That would just mean rounding some numbers by different rules than others. Instead, add a remark like "might not add up to 100% due to rounding" if this is for human-readable representation, or don't round if you intend to calculate with it. – tobias_k Aug 12 '14 at 18:22
  • @hlt, I know the reason behind this. However, don't you think it is kind of stupid to show your client a donut chart that doesn't add up to 100%? People usually won't notice it, and developers can even understand it, but non-tech people who do notice it will feel you are not professional: "Hey man, can you even make those numbers add up to 100%? Do you even know math?" That is the point of this post. Thank you for your reply though~~ – Jerry Meng Aug 12 '14 at 18:29
  • @Cyber, that post is interesting. Thank you. – Jerry Meng Aug 12 '14 at 18:39
  • @tobias_k, I totally agree with you, but I am talking about solving a pragmatic issue rather than a scientific one. And you are right, those people prefer a screwed data set as long as it adds up to 100%. No one will try to recompute those percentages (they don't have the data to do it), but they will simply sum them. Even if the percentages are not accurate (actually they are not that bad), they won't know it, but they will know if they don't add up to 100%. – Jerry Meng Aug 12 '14 at 18:54
  • @JerryMeng In this case you have plenty of good answers on that duplicate question. Still, I think this is really strange... What if there are 1 million values, each with 1/10000 of a percent? Or just what if each of three values has exactly 1/3? – tobias_k Aug 12 '14 at 19:23
  • @tobias_k, yes, and I like one of those good answers. Thanks for your tips. As for your question, well, there are no perfect solutions. However, as long as we accept the assumption that nothing is accurate, we can live with it. If three numbers are all 1/3, then randomly pick one and make it 33.34%. Not perfect, but it solves the problem. And in that case, everyone who knows math will know what happened, even if they don't know programming. Thanks again for your advice. – Jerry Meng Aug 12 '14 at 19:34

2 Answers

0

As others have commented, if your numbers are nice and round as in your example, you can use the fractions module to retain the accuracy of the rational numbers:

In [2]: from fractions import Fraction

In [5]: a = Fraction(1)

In [6]: b = Fraction(14)

In [7]: a/(a+b)
Out[7]: Fraction(1, 15)

In [8]: a/(a+b) + (b/(a+b))
Out[8]: Fraction(1, 1)

This obviously doesn't look good if you have really odd fractions.
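As a rough illustration with the three-value case from the question (the variable names are just for this sketch), the exact fractions do sum to 1, but the rounding problem comes back as soon as they are converted to two-decimal percentages for display:

```python
from fractions import Fraction

values = [1, 7, 7]
total = sum(values)
shares = [Fraction(v, total) for v in values]

print(shares)            # exact rational shares
print(sum(shares) == 1)  # the exact shares sum to one

# Converting to two-decimal percentages reintroduces rounding.
displayed = [round(float(s) * 100, 2) for s in shares]
print(displayed)
```

So fractions keep the arithmetic exact, while the choice of which displayed value to adjust still has to be made separately.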

Noah
-2

Welcome to IEEE Floats.

The floating point numbers returned from math operations in Python are approximations. For some values, the sum of percentages will be greater than 100.

You have two solutions: use the fractions or decimal modules, or simply don't require them to add up to 100%.

Tritium21
  • Interesting that instant down-vote. That is what the questioner is experiencing; the lack of precision in the IEEE float. – Tritium21 Aug 12 '14 at 18:24
  • Right, but it does not answer the question: "So my question is what should I do to make sure they all add up to 100% in any cases." (PS: Not the downvoter) – tobias_k Aug 12 '14 at 18:25
  • @tobias_k Fair enough, added two solutions. – Tritium21 Aug 12 '14 at 18:28