2

Looking for a quick way to create L amount of lists of n amount of decimals whose sum is 1. Each number should be >= 0.01

Desired Output:

where L = 200, n = 6

[0.20, 0.22, 0.10, 0.06, 0.04, 0.38]
[0.32, 0.23, 0.18, 0.07, 0.05, 0.15]
...
# There are 200 of these

where L = 200, n = 3

[0.90, 0.10, 0.10]
[0.35, 0.25, 0.30]
...
# There are also 200 of these

The tricky part that I can't think of a practical solution to is ensuring there are no zeros in each list. This becomes especially hard when n reaches large numbers. How do you divvy up the fragments of the value 1 accordingly?

Paul
  • 26,170
  • 12
  • 85
  • 119
Alex McLean
  • 2,524
  • 5
  • 30
  • 53
  • In what way should it be quick – quick for typing? Quickly understandable for a reader? Quick for you to copy-paste from StackOverflow? – lenz Aug 09 '15 at 22:45
  • And what does perfectly random mean? – lenz Aug 09 '15 at 22:46
  • And what does "no zero" mean? Are you okay with `0.00000000001`? – mkrieger1 Aug 09 '15 at 22:46
  • Are the numbers randomly generated? are the numbers always fixed to 2 decimal places? are all list entries different? is this homework? – fixmycode Aug 09 '15 at 22:46
  • @ mkriger1 good question. Should have been more clear, all numbers should be >= 0.01. – Alex McLean Aug 09 '15 at 22:48
  • @lenz Perfectly random as in the probability of getting 0.99 or 0.01 should be the same as getting 0.50 – Alex McLean Aug 09 '15 at 22:49
  • @lenz, also, quick as in the speed for the computer to generate the list – Alex McLean Aug 09 '15 at 22:49
  • Smells like you want to randomly generate loaded dices. Am i right? – Dleep Aug 09 '15 at 22:52
  • @McLean25 can the numbers have more sig-figs than 2, i.e. is 0.012 allowed or are you only accepting the 0.01-0.99 in increments of 0.01? And also, do the numbers need to be unique or would say `[0.25, 0.25, 0.25, 0.25]` be allowed for n=4? – isosceleswheel Aug 10 '15 at 00:16

4 Answers4

2

This should be very fast, as it uses numpy.

It will automatically repeat the randomization if it gets any 0.0's, but that is unlikely. The while loop was written before the OP adjusted the non-zero requirement to be above 0.01. To fix this you can modify the while block to include the entire subsequent code, and calculate the number of violations of any desired constraint at the end in a manner similar to what is shown for detecting zeros. But that may get slow when L is large compared to the probability of violating the constraint. In some sense it is easiest to comply with the original requirement of >0.0.

After the while loop, each element of an L x n matrix is uniformly distributed on (0.0,1.0) without any 0s or 1s. Each row is summed and used to form a scale matrix, that is then matrix multiplied by the random matrix to obtain rows that automatically sum to 1.0

 import numpy as np
 def random_proportions(L,n):
      zeros = 1
      while zeros>0:
          x = np.random.random(size=(L,n))
          zeros = np.sum(x==0.0)
      sums = x.sum(axis=1)
      scale = np.diag(1.0/sums)
      return np.dot(scale, x)

EDIT: The above produces an LxL matrix for scale, which is memory-inefficient. It will OOM before L=10**6. We can fix that by using the broadcasting normalization procedure suggested by this answer

import numpy as np
def random_proportions(L,n):
      zeros = 1
      while zeros>0:
          x = np.random.random(size=(L,n))
          zeros = np.sum(x==0.0)
      sums = x.sum(axis=1).reshape(L,1) # reshape for "broadcasting" effect
      return x/sums

This 2nd version will calculate 1 millions lists of size 10 in about 1/3 of a second on an AMD FX-8150 with 16GB ram:

%timeit l = random_proportions(1000000,10)
1 loops, best of 3: 347 ms per loop
Community
  • 1
  • 1
Paul
  • 26,170
  • 12
  • 85
  • 119
1

Here's how you get n numbers that add up to one: Select n random numbers in an arbitrary range of your choice (for example, from 1 to 10), then divide them all by their sum.

mkrieger1
  • 19,194
  • 5
  • 54
  • 65
  • But since all decimals must be >= 0.01, max range'd only be 16, giving less than 16^6 possible lists – Dleep Aug 09 '15 at 22:49
  • I meant that there's a lot of possible combinations that cannot possibly be generated by your algorithm. Can't call it perfectly random if there's no way for, for example, `0,99` to appear – Dleep Aug 09 '15 at 22:53
1

This should do the trick:

import random


def floatPartition(n, total):
    answer = []
    for _ in range(n-1):
        num = random.uniform(0, total)
        answer.append(num)
        total -= num
    answer.append(total)
    return answer


def paritions(n,L):
    return [floatPartition(n, 1) for _ in range(L)]


if __name__ == "__main__":
    answer = paritions(6,200)
inspectorG4dget
  • 110,290
  • 27
  • 149
  • 241
1

I didn't check against the others for speed, but this algorthim produces 1,000,000 lists of length 10 with elements 0.01 - 0.99 in increments of 0.01 in 20 seconds:

import random
def rand_list(n):

    sub_list = []
    max_val  = 100 - n + 1  # max needs to be n-1 less than 100

    for repetition in xrange(n-1):

        sub_list += [random.randrange(1, max_val)]
        max_val  -= (sub_list[-1] - 1)  # decrease the range by the latest element added - 1

    sub_list += [max_val]  # use the remainder for the last value, this way it adds to 100

    return [round(x/100.0, 2) for x in sub_list]  # convert to 0.01 - 0.99 with list comprehension
isosceleswheel
  • 1,516
  • 12
  • 20