Split number in randomly-sized portions in Python

Question

I have x = 10 and y = 100.

Can I distribute y elements in randomly-sized portions among x 'element holders'?

I want to create x categories each with a random number of items; however, the number of items created should be exactly y.

I guess it's something like

import random

# number of categories and items
x, y = 10, 100

# keep track of how many items we have left to add
y_left = y

# create all categories
for i in range(x):
    # create category

    # find number of items in this category
    num_items_in_category = random.randint(1, y_left)

    # create items
    for j in range(num_items_in_category):
        pass  # create item

    # set new number of items left to add
    y_left -= num_items_in_category

Instead of directly creating objects in groups of the sizes of the "portions", why not just create all of the `y` objects in a single loop and assign each one to a random bucket from the `x` selection immediately after creation? — Alex Celeste, May 02 '15 at 16:05
It's a much smarter idea. But how can I retrieve a random object among the newly created companies? — Jamgreen, May 02 '15 at 16:30
it will be random among all categories and not just those I've created in same routine — Jamgreen, May 02 '15 at 16:55

score 1 · Answer 1 · edited May 23 '17 at 12:21

Using the functions from this answer, you can generate a list of x random numbers that sum to y.

Then iterate over this list and, for each holder, make that many random choices (with removal) from a population of elements.

Or by example:

# requires RandIntVec() & RandFloats() from the linked answer
import random

# whatever population you are choosing from (here: letters and digits)
population = list('qwertyuiopasdfghjklzxcvbnmQWERTYUIOPASDFGHJKLZXCVBNM1234567890')

# sizes (whatever you want)
num_of_holders = 7
total_elements = len(population)

# empty holders to put results in
element_holder = [[] for _ in range(num_of_holders)]

# distribute total_elements elements in randomly-sized portions among num_of_holders 'element holders'
random_portions_in_holder = RandIntVec(num_of_holders, total_elements, Distribution=RandFloats(num_of_holders))

# assign each portion of elements to its 'holder'
for h, portion in enumerate(random_portions_in_holder):
    for p in range(portion):
        index = random.randrange(len(population))
        element_holder[h].append(population.pop(index))

# display
print('Randomly-portioned elements')
for h in element_holder:
    print(h)

# verify
elements_in_holders = sum(len(h) for h in element_holder)
print('\nMatch desired result?')
print('total_elements      :', total_elements)
print('elements in holders :', elements_in_holders)
print('match               :', total_elements == elements_in_holders)

Output:

Randomly-portioned elements
['M', 'N', 'f', 'V', 'v', 'h', 'i', 'H', '6', '5', 'j', '7', 'r']
['u', 'Z', 'C', 'I', 's', 'm', 'g', 'p', 'q', 'a', 'O', 'T', 'L']
['K', 'E', 'P', 'U']
['Y', 'D', 'A', 'l', 'J', 'R', 'b', 'c', 'z', 'F']
['0', '1', 'o', 'X', 'G', '4', 'W', '3', '2']
['d', 'Q']
['e', 'y', 'B', '8', 'x', 'k', 'w', 't', 'S', 'n', '9']

Match desired result?
total_elements      : 62
elements in holders : 62
match               : True

P.S. I had some indentation errors when I copied the linked functions, so you may need to correct them. The code under elif Distribution.lower() == 'normal': needs to be un-indented by 1 level. I submitted an edited version (pending approval), so depending on when you copy, you may or may not need to edit.
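
If the linked functions are not at hand, the first step (x random integers that sum to y) can also be sketched with the standard library alone. This is a different technique from RandIntVec(), not a reimplementation of it: it picks x - 1 random cut points between 0 and y and uses the gaps between them as portion sizes. The name random_portions is made up for this example, and the sketch assumes every holder must receive at least one element.

```python
import random

def random_portions(num_holders, total):
    """Split `total` into `num_holders` positive integers that sum to `total`."""
    if not 1 <= num_holders <= total:
        raise ValueError("need 1 <= num_holders <= total")
    # Pick num_holders - 1 distinct cut points strictly between 0 and total.
    cuts = sorted(random.sample(range(1, total), num_holders - 1))
    # Portion sizes are the gaps between consecutive boundaries.
    bounds = [0] + cuts + [total]
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]

portions = random_portions(10, 100)
print(portions, sum(portions))
```

The resulting list can be used in place of random_portions_in_holder in the loop above.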

score 1 · Answer 2 · answered May 02 '15 at 18:23

This isn't actually too difficult. Let's create our containers:

import random

num_containers = 10
num_objects = 100

containers = [[] for _ in range(num_containers)]
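# some_object() is a placeholder for whatever creates one of your items;
# we only iterate over the objects once, so a generator expression is enough
objects = (some_object() for _ in range(num_objects))

for obj in objects:  # `obj` avoids shadowing the built-in `object`
    random.choice(containers).append(obj)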
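
Note that with this approach some containers can end up empty. If every container should hold at least one object, as the random.randint(1, ...) in the question suggests, one possible variation is to seed each container with one object and distribute only the remainder at random. A minimal sketch; make_object is a hypothetical stand-in for whatever creates your items:

```python
import random

def make_object(i):
    # Hypothetical placeholder: replace with real item creation.
    return "item-%d" % i

num_containers = 10
num_objects = 100

objects = [make_object(i) for i in range(num_objects)]
random.shuffle(objects)

# Give each container one object so that none stays empty...
containers = [[objects[i]] for i in range(num_containers)]
# ...then drop every remaining object into a randomly chosen container.
for obj in objects[num_containers:]:
    random.choice(containers).append(obj)

sizes = [len(c) for c in containers]
print(sizes, sum(sizes))
```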

score 0 · Answer 3 · answered Dec 12 '18 at 20:33

I encountered a similar problem in my work and solved it using a dictionary.

import random
# number of categories (number of keys in the dictionary) and items (sum of all the values)
x, y = 10, 100

snpDict = {}
i = 0
while i < y:
    a = random.randrange(x)  # category index in 0..x-1 (randint(0, x) would also return x)
    if a in snpDict:
        snpDict[a] += 1
    else:
        snpDict[a] = 1

    i += 1

I have a huge number of categories and items, and with those the performance is not good.
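
If only the per-category counts are needed, the explicit Python loop can be avoided, which may help with large inputs. A sketch, assuming Python 3.6 or later (for random.choices): draw all y category indices in one call and tally them with collections.Counter. The name snp_counts is made up for this example, and categories that receive no items are simply absent from the result.

```python
import random
from collections import Counter

x, y = 10, 100

# Draw y category indices (each in 0..x-1) in a single call, then count them.
snp_counts = Counter(random.choices(range(x), k=y))

print(dict(snp_counts))
print(sum(snp_counts.values()))
```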
