
I have seen many solutions for generating random floats within a specific range (like this), which actually help me, and solutions for generating random floats that sum to 1 (like this). Separately, these solutions work perfectly, but I can't figure out how to merge them.

Currently my code is:

import random
def sample_floats(low, high, k=1):
    """ Return a k-length list of unique random floats
        in the range of low <= x <= high
    """
    result = []
    seen = set()
    for i in range(k):
        x = random.uniform(low, high)
        while x in seen:
            x = random.uniform(low, high)
        seen.add(x)
        result.append(x)
    return result

And still, applying

import numpy as np

weights = sample_floats(0.055, 1.0, 11)
weights /= np.sum(weights)

returns a weights array in which some floats are less than 0.055.

Should I somehow use np.random.dirichlet in the function above, or should the function be built on np.random.dirichlet with a > 0.055 condition added afterwards? I can't figure out a solution.

Thanks in advance!

ker_laeda86
  • Can you explain what you're trying to achieve? Can you provide an example output? – mozway Jan 04 '22 at 09:42
  • @mozway the output should be a randomly generated k-length array of floats, each greater than a minimum value (`0.055` in my case), summing to `1`. It is for generating weights for stocks: I don't want any of them to be 0.00001, but I want them to sum to 1 – ker_laeda86 Jan 04 '22 at 09:50
  • Thanks for clarifying. I provided an [answer](https://stackoverflow.com/a/70577221/16343464) that does not require loops or iterations. As you'll see in the answer, be aware that setting constraints limits what you can do. A higher `low` limits the possible `k`, and I am pretty sure it is mathematically impossible to fix `low`, `k` **and** `high` at the same time. – mozway Jan 04 '22 at 10:24

3 Answers


The samples are correlated, so I believe you can't generate them in an IID way. You can, however, generate them iteratively, as in the code below. There are a few more special cases to check, such as low > high or high*k < sum, but I figure you can find and handle them starting from my modification of your code.

import random
import warnings
  

def sample_floats(low = 0.055, high = 1., x_sum = 1., k = 1):
    """ Return a k-length list of unique random floats
        in the range of 'low' <= x <= 'high' summing up to 'x_sum'.
    """
    sum_i = 0
    xs = []
    
    if x_sum - (k-1)*low < high:
        warnings.warn(f'high = {high} is too high to be generated under the'
            f' conditions set by k = {k}, sum = {x_sum}, and low = {low}.'
            f' high automatically set to {x_sum - (k-1)*low}.')
        # Apply the cap announced in the warning.
        high = x_sum - (k-1)*low

    if k == 1:
        if high < x_sum:
            raise ValueError(f'The parameter combination k = {k}, sum = {x_sum},'
                f' and high = {high} is impossible.')
        # Return a list so the result type matches the k > 1 case.
        else: return [x_sum]
    high_i = high
    for i in range(k-1):
        x = random.uniform(low, high_i)
        xs.append(x)
        sum_i = sum_i + x
        if high < (x_sum - sum_i - (k-1-i)*low):
            high_i = high
        else: high_i = x_sum - sum_i - (k-1-i)*low

    xs.append(x_sum - sum_i)

    return xs

For example:

random.seed(0)
xs = sample_floats(low = 0.055, high = 0.5, x_sum = 1., k = 5)
print(xs)
print(sum(xs))

Output:

[0.43076772392864643, 0.27801464913542906, 0.08495210994346317, 0.06568433355884717, 0.14058118343361425]
1.0
yann ziselman
  • Doesn't seem to work for `sample_floats(0.01, 1, k=15)` – Mortz Jan 04 '22 at 09:46
  • @Mortz, sorry, there was an issue I missed. Check my implementation now – yann ziselman Jan 04 '22 at 09:53
  • Hi @yannziselman, thanks for the suggestion! It looks like it works properly, but the higher `k` is, the more values end up close to `low`. Can you suggest a solution that generates a truly random distribution? – ker_laeda86 Jan 04 '22 at 10:05
  • Think of it like breaking a strand of pasta into k pieces. If you break off bigger pieces at the start, you're going to end up with a lot of pieces close to the minimal length at the end – yann ziselman Jan 04 '22 at 10:18

You could generate k-1 numbers iteratively by varying the upper bound of the uniform random number generator. The constraint at each iteration is that the number generated must leave enough for each of the remaining numbers to be at least low; the last number is whatever remains of the sum.

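import random

def sample_floats(low, high, k=1):
    # Note: `high` is accepted but not used; the upper bound of each draw
    # is derived from `low`, `k` and the target sum of 1.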
    result = []
    generated = 0
    while generated < k-1:
        current_higher_bound = max(low, 1 - (k - 1 - generated)*low - sum(result))
        next_num = random.uniform(low, current_higher_bound)
        result.append(next_num)
        generated += 1
    last_num = 1 - sum(result)
    result.append(last_num)
    return result

print(sample_floats(0.01, 1, k=15))
#[0.08878760926151083,
# 0.17897435239586243,
# 0.5873150041878156,
# 0.021487776792166513,
# 0.011234379498998357,
# 0.012408564286727042,
# 0.015391011259745103,
# 0.01264921242128719,
# 0.010759267284382326,
# 0.010615007333002748,
# 0.010288605412288477,
# 0.010060487014659121,
# 0.010027216923973544,
# 0.010000064276203318,
# 0.010001441651377285]
Mortz
  • Hi @Mortz, thanks for the suggestion! It works, but why does it produce so many similar values? Is it because they are distributed uniformly? Can you make it generate truly random floats within the range (low, high)? – ker_laeda86 Jan 04 '22 at 10:00
  • It results in so many similar values because progressively the range from which the random numbers are being picked gets narrower. One way around this problem is to try and generate numbers closer to the lower bound in each iteration - for example by replacing `next_num = random.uniform(low, current_higher_bound)` with something like `next_num = random.triangular(low, current_higher_bound, (low + current_higher_bound) / 20)` - where the factor of `20` ensures the number is closer to the lower bound – Mortz Jan 04 '22 at 10:25
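
The triangular idea from the comment above could look roughly like the sketch below. It is only an illustration, not code from the answer: the names `sample_floats_triangular`, `total` and `skew` are made up here, and the mode is placed a fixed fraction of the way between the two bounds (instead of the `/ 20` expression) so that it always stays inside the interval being sampled.

```python
import random

def sample_floats_triangular(low, k, total=1.0, skew=0.25):
    """Hypothetical variant of the iterative approach: each value comes
    from a triangular distribution whose mode sits a fraction `skew`
    of the way from `low` to the current upper bound."""
    result = []
    remaining = total
    for i in range(k - 1):
        # Leave at least `low` for each of the k-1-i values still to come.
        upper = max(low, remaining - (k - 1 - i) * low)
        # Assumption: keep the mode inside [low, upper].
        mode = low + skew * (upper - low)
        x = random.triangular(low, upper, mode)
        result.append(x)
        remaining -= x
    # The last value is whatever is left of the target sum.
    result.append(remaining)
    return result

print(sample_floats_triangular(0.01, 15))
```

A smaller `skew` pushes the early draws towards `low`, which leaves more room for the later ones; it softens the clustering near `low` but does not make the values identically distributed.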

IIUC, you want to generate an array of k values, with minimum value of low=0.055.

It is easier to generate numbers from 0 that sum up to 1-low*k, and then to add low so that the final array sums to 1. Thus, this guarantees both the lower bound and the sum.

Regarding high, I am pretty sure it is mathematically impossible to add this constraint: once you fix the lower bound and the sum, there are not enough degrees of freedom to choose an upper bound. The upper bound will be 1-low*(k-1) (here 0.505).
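
For reference, the implied upper bound follows directly from the two constraints. Writing w_i for the individual weights (a notation assumed here, not used in the answer):

```latex
\[
\sum_{i=1}^{k} w_i = 1,\quad w_i \ge \mathrm{low}
\;\Longrightarrow\;
w_j = 1 - \sum_{i \ne j} w_i \;\le\; 1 - (k-1)\,\mathrm{low}
\]
\[
\mathrm{low} = 0.055,\; k = 10:\qquad 1 - 9 \times 0.055 = 0.505
\]
```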

Also, be aware that, with a minimum value, you necessarily enforce a maximum k of 1//low (here 18 values). If you set k higher, the low bound won't be correct.

import numpy as np

# parameters
low = 0.055
k = 10

a = np.random.rand(k)
a = (a/a.sum()*(1-low*k))
weights = a+low

# checking that the sum is 1
assert np.isclose(weights.sum(), 1)

Example output:

array([0.13608635, 0.06796974, 0.07444545, 0.1361171 , 0.07217206,
       0.09223554, 0.12713463, 0.11012871, 0.1107402 , 0.07297022])
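
The same steps can be wrapped in a small function that also rejects infeasible parameters. This is only a sketch: `sample_weights` and its `rng` argument are names chosen for illustration, and it uses NumPy's `default_rng` generator in place of `np.random.rand`.

```python
import numpy as np

def sample_weights(k, low, rng=None):
    """Hypothetical helper: k random weights, each >= low, summing to 1."""
    if k * low > 1:
        # With k values of at least `low`, the sum would exceed 1.
        raise ValueError(f"k * low = {k * low:.3f} exceeds 1; no valid weights exist")
    rng = np.random.default_rng() if rng is None else rng
    a = rng.random(k)
    # Scale the raw draws to sum to the slack 1 - k*low, then shift by low.
    return low + a / a.sum() * (1 - k * low)

weights = sample_weights(10, 0.055, rng=np.random.default_rng(0))
print(weights)
print(weights.min() >= 0.055, np.isclose(weights.sum(), 1.0))
```

Passing a seeded generator makes the result reproducible; leaving `rng` out gives fresh weights on every call.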
mozway
  • how do you take into account the upper bound 'high'? – yann ziselman Jan 04 '22 at 10:34
  • @yannziselman as explained in the answer, you can't set it if you have already set `low` and `k`: once you choose 2 parameters, the last one is fixed. This is just math, you cannot do anything against it ;) – mozway Jan 04 '22 at 10:38
  • An elegant solution that seems to work exactly as I want. Indeed, I'm not concerned about `high`, assuming that `sum=1` takes care of it automatically. Thank you very much! – ker_laeda86 Jan 04 '22 at 10:46