I multiply many probabilities into one.

P.S. A probability will never be zero (the minimum is 0.01) and never be one hundred percent (the maximum is 0.99).

# Counters for how often each product ends up larger.
a_is_greater = 0
b_is_greater = 0

for probabilities in get_random_list_of_probabilities():
    a = 1.0
    b = 1.0
    for probability in probabilities:
        a *= probability
        b *= (1 - probability)

    if a > b:
        a_is_greater += 1
    if b > a:
        b_is_greater += 1

After some iterations, `a` can be as small as about 5.087e-258.

According to sys.float_info, the smallest positive normalized float my Python can represent is about 2.225e-308.
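
To make the limit concrete, here is a minimal sketch of how it can be inspected and how repeated multiplication runs into it. It assumes the usual IEEE 754 double-precision floats; the variable names `a` and `steps` are only for illustration.

```python
import sys

# Smallest positive *normalized* float; this is the ~2.225e-308 figure.
print(sys.float_info.min)

# Halving a value over and over eventually underflows to exactly 0.0.
a = 1.0
steps = 0
while a > 0.0:
    a *= 0.5
    steps += 1

print(steps)  # number of halvings needed before a became 0.0
```

Values below `sys.float_info.min` are still representable for a while as subnormal numbers, with reduced precision, before the result finally becomes `0.0`.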

I am afraid of running my code on other machines.

How can I normalize my values?

Thanks a lot!

John Doe
  • This problem is often solved by working in log space. See also https://stackoverflow.com/questions/3704570/in-python-small-floats-tending-to-zero – Mark Dec 21 '20 at 15:47
  • You could also take a look at the `decimal` module which allows you to represent arbitrarily small numbers only limited by your available memory. – sunnytown Dec 21 '20 at 15:49
  • Working in log space is cool, but I have one question. After the for loop I compare `a` and `b` and count how often `a` is greater than `b` and how often `b` is greater than `a`. The ratio between these two counts is different from the ratio I get when I work in log space. Why? P.S. Please look at the post to see the point of the comparison. – John Doe Dec 21 '20 at 16:25

1 Answer

Sounds like you just want to clip the two numbers to the range [0.01, 0.99] (which is representable on any machine):

def normalize(p: float) -> float:
    """Normalize a probability to within 0.01 and 0.99."""
    return max(0.01, min(0.99, p))


for probabilities in get_random_list_of_probabilities():
    a = 1
    b = 1
    for probability in probabilities:
        # Clip after every multiplication so neither value can underflow.
        a = normalize(a * probability)
        b = normalize(b * (1 - probability))

    if a > b:
        a_is_greater += 1
    if b > a:
        b_is_greater += 1
Samwise
  • Hi. Look at the post; I added the part with the comparison. The two counters `a_is_greater` and `b_is_greater` come out different when I use your function. I think it doesn't work well. – John Doe Dec 21 '20 at 16:29
  • You don't want to normalize those values, just `a` and `b`. And you may indeed get different results since you're deliberately changing the values to fit within the bounds you gave; depending on the random distribution, you may end up with significantly larger or smaller numbers than you would have gotten without that clipping. But also, if all you care about is which is greater, why do you even care about normalizing `a` and `b` in the first place? – Samwise Dec 21 '20 at 16:40
  • I'll give you an example. I'll multiply `a` by `0.5` 1000 times. It gives me `1*(0.5^1000) = 9.332636e-302`. And if I exceed the `e-305` limit, I'll get `a = 0` because of my Python's limitation. That is my problem. I don't know exactly what I need. I have been wondering about normalizing the `a` variable after each multiplication. – John Doe Dec 21 '20 at 16:49
  • Do you not want `0.01` in that case? If I do `for _ in range(1000): a = normalize(a * 0.5)` I get `0.01` as the final result. – Samwise Dec 21 '20 at 16:59
  • I am sorry, I don't know where I should paste the code. https://pastebin.com/RSmp0B9w Look at this code. When I use your normalizing, I get `a == b`. When I comment out your part and uncomment my part, the basic multiplication, I get `b is greater`. – John Doe Dec 21 '20 at 17:12
  • I edited my answer to reference the code in your edited question. – Samwise Dec 21 '20 at 19:24
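
The `a == b` outcome described in the comments can be reproduced with a small sketch. The input list here is made up for illustration; `normalize` is the clipping function from the answer. Once both running products fall below 0.01, clipping pins each of them to 0.01, so the information about which one was larger is gone.

```python
def normalize(p: float) -> float:
    """Clip p to the range [0.01, 0.99]."""
    return max(0.01, min(0.99, p))


# Hypothetical input: without clipping, b would end up far larger than a.
probabilities = [0.3] * 50

a = b = 1.0
for p in probabilities:
    a = normalize(a * p)
    b = normalize(b * (1 - p))

print(a, b)
print(a == b)
```

Both values finish at the lower bound, so neither counter would be incremented for this list, which seems consistent with the differing counts reported above.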