Generating list of probabilites

Question

I am trying to create a loop that will generate a series of probabilities against which a list of values can be compared. I need the probabilities to be in list form and the length of the list can vary. For example, if the list is lst = [1, 2, 3] and I want to set the probability of calling lst[2] at 80% with equal likelihood of the other two values, I would use p = [0.1, 0.1, 0.8].

I want to be able to cycle through all of the possible probabilities (with a definable step, in this case step = 0.1). My logic has got stuck at this point:

lst = [1, 2, 3]
step = 0.1

for p in probability_gen_function(lst, step):
    print(p)

loop_1 -> p = [0.1, 0.1, 0.8]
loop_2 -> p = [0.1, 0.2, 0.7]
---
loop_n -> p = [0.8, 0.1, 0.1]

I can see ways for my probability_gen_function to generate random lists of probabilities, but not how to systematically cycle through them. Any suggestions would be appreciated.

This is not an easy task. It's a problem of probabilities. You must take the variable x=1/step and find all combinations of specific length of integers, with sum==x In your example x=10, so all combinations (1,1,8), (1,2,7), (2,2,6) are valid as well as all the perumations of each combination. See here for some ideas how to work on this. (Your case is simplier than the general problem, because you need combinations to have specific length): https://stackoverflow.com/questions/4632322/finding-all-possible-combinations-of-numbers-to-reach-a-given-sum — IoaTzimas, Jan 30 '21 at 16:20
`The shape of the code that I need is` - sounds like you are asking for someone to write some code for you. If that is not the case, please update your question. Welcome to SO. This isn't a discussion forum or tutorial. Please take the [tour] and take the time to read [ask] and the other links found on that page. — wwii, Jan 30 '21 at 16:22
You apparently know how to iterate with a for loop. What is preventing you from iterating over values from 0 to 1 in .1 increments then using that for `step`? — wwii, Jan 30 '21 at 16:26
@wwii Your comments seem unwaranted. Just because someone may know about the parts of a solution doesn't mean he has the insight to put them together in the right order/mix. And most questions answer for some sort of code. The problem and solution were very well defined and he was asking for a path how to get there. — Reti43, Jan 30 '21 at 16:34
You have gotten some good answers here. I'll just point out that if `1/step` is an integer (number of stars), this reduces to a [stars and bars](https://en.wikipedia.org/wiki/Stars_and_bars_(combinatorics)) problem and you can look for implementations [here](https://stackoverflow.com/questions/28965734/general-bars-and-stars). — Reti43, Jan 30 '21 at 16:40
The difficulty in this problem is that the number of nested for loops depends on the length of ```lst```. As that is an input to the problem, you can't write a solution just with for loops (not easily anyway). The way to get around that is with recursion, but its entirely fair that someone would know about for loops but not recursion. It's a good and highly instructive question. — ScienceSnake, Jan 30 '21 at 16:40
@Reti43 is this a duplicate of the General stars and bars question you linked to? — wwii, Jan 30 '21 at 16:44
@wwii If you want the step size to add up to 1 exactly, then yes, it can be reduced to that problem. But I deemed it wasn't a duplicate per se, but rather relevant to link. — Reti43, Jan 30 '21 at 16:57
@wwii good point. I have updated the offending line to better show my meaning. Let me know if that sound better. — GalacticPonderer, Jan 30 '21 at 18:04

score 3 · Accepted Answer · answered Jan 30 '21 at 16:25

You can use a recursive generator function:

lst = [1, 2, 3]
step = 0.1
def prob_gen(d, s, c = []):
   if len(c) == len(d):
      yield [round(i, 2) for i in c]
   else:
      r = 10-sum(int(k*10) for k in c)
      for i in ([r] if len(c)+1 == len(d) else range(1, r)):
         yield from prob_gen(d, s, c+[s*i])


print(list(prob_gen(lst, step)))

Output:

[[0.1, 0.1, 0.8], [0.1, 0.2, 0.7], [0.1, 0.3, 0.6], [0.1, 0.4, 0.5], [0.1, 0.5, 0.4], [0.1, 0.6, 0.3], [0.1, 0.7, 0.2], [0.1, 0.8, 0.1], [0.2, 0.1, 0.7], [0.2, 0.2, 0.6], [0.2, 0.3, 0.5], [0.2, 0.4, 0.4], [0.2, 0.5, 0.3], [0.2, 0.6, 0.2], [0.2, 0.7, 0.1], [0.3, 0.1, 0.6], [0.3, 0.2, 0.5], [0.3, 0.3, 0.4], [0.3, 0.4, 0.3], [0.3, 0.5, 0.2], [0.3, 0.6, 0.1], [0.4, 0.1, 0.5], [0.4, 0.2, 0.4], [0.4, 0.3, 0.3], [0.4, 0.4, 0.2], [0.4, 0.5, 0.1], [0.5, 0.1, 0.4], [0.5, 0.2, 0.3], [0.5, 0.3, 0.2], [0.5, 0.4, 0.1], [0.6, 0.1, 0.3], [0.6, 0.2, 0.2], [0.6, 0.3, 0.1], [0.7, 0.1, 0.2], [0.7, 0.2, 0.1], [0.8, 0.1, 0.1]]

armamut · Answer 2 · 2021-01-30T16:26:58.433

We'll use numpy and itertools.


import numpy as np
from itertools import product

def probability_gen_function(lst, step):
    # Enumerate range of probabilities. 
    # This array is like [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
    probs = np.arange(step, 1, step)
    # Run through all permutations
    for p in product(probs, repeat=len(lst)):
        # Check if sum is 1
        # We check with some small number because of floating point errors 
        if np.abs(np.sum(p) - 1) > 1e-9:
            continue
        # Yield number
        yield np.array(p)

lst = [1, 2, 3]
step = 0.1

for p in probability_gen_function(lst, step):
    print(p.round(5))
>>>
[0.1 0.1 0.8]
[0.1 0.2 0.7]
...
[0.7 0.2 0.1]
[0.8 0.1 0.1]

There're some points to notice. First of all this function does not check step size. You should provide proper values. And, because we generate them with np.arange, there may be some floating point errors, you would want to normalize or round the probabilities. But I think this is enough to start.

What about (0.1, 0.1, 0.8) (0.2, 0.2, 0.6) (0.3, 0.3, 0.4) etc? They are not included in your output — IoaTzimas, Jan 30 '21 at 16:23
This is very wasteful. You're checking **all** possible lists and rejecting the ones that dont sum to one. As ```lst``` gets larger, that will be the overwhelming majority of them. — ScienceSnake, Jan 30 '21 at 16:35

ScienceSnake · Answer 3 · 2021-01-30T16:33:32.843

This is perfect for recursiveness. I found it easier to work with a constant step size of 1 and make the normalisation of the probability vary inside the loop, but it doesn't change the output.

I also explicitly check that the step size you give allows valid solutions.

def normalised_prob_gen(num_outcomes, step_size):
    num_steps = int(1 / step_size)
    assert num_steps*step_size == 1, "Step size does not divide into one"

    yield from probability_gen_function(num_outcomes, num_steps, num_steps)


def probability_gen_function(num_outcomes, num_steps, prob_left):
    if num_outcomes == 1:
        yield [prob_left / num_steps]
    else:
        for x in range(1, prob_left-num_outcomes+2):
            for rest in probability_gen_function(num_outcomes-1, num_steps, prob_left-x):
                yield [first / num_steps] + rest


for p in normalised_prob_gen(3, 0.1):
    print(p)

Note that the items in lst don't matter, only the number of them, so I use that as the input.

Generating list of probabilites

3 Answers3