Select one element from a list using Python following the normal distribution

Question

I would like to select one element from a list using Python, following a normal distribution. I have a list, e.g.,

alist = ['an', 'am', 'apple', 'cool', 'why']

For example, according to the probability density function (PDF) of the normal distribution, the 3rd element in the given list should have the highest probability of being chosen. Any suggestions?

Normal distribution is defined for a continuous unbounded variable. In your case you can draw samples from a normal distribution and round them to integers and drop values outside the bounds, for example. This will be a normal-ish distribution, which may be what you want. — fjarri, Feb 18 '16 at 03:51
Do be aware that a normal distribution does not have lower or upper bounds on output; it is vanishingly unlikely, but you *could* get a +20 sigmas value back. — Hugh Bothwell, Feb 18 '16 at 03:53
There isn't any such thing as *the* normal distribution. It is a two-parameter family of distributions. You seem to want the mean to be the midpoint of your list, but that still leaves the variance up in the air. For some choices of variance, the choices would be virtually indistinguishable from uniformly chosen. For other choices, you would be returning the middle element almost always. You really need to clarify just what you want. — John Coleman, Feb 18 '16 at 03:58

Hugh Bothwell · Accepted Answer · 2016-02-18T04:05:35.590

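from math import floor
from random import normalvariate

def normal_choice(lst, mean=None, stddev=None):
    if not lst:
        # an empty list has no valid index, so the loop below would never end
        raise IndexError("cannot choose from an empty list")

    if mean is None:
        # if mean is not specified, use center of list
        mean = (len(lst) - 1) / 2

    if stddev is None:
        # if stddev is not specified, let list be -3 .. +3 standard deviations
        stddev = len(lst) / 6

    while True:
        # round to the nearest index; floor() is used because int() truncates
        # toward zero, which would map values in (-1, 0) to index 0 and
        # over-weight the first element
        index = floor(normalvariate(mean, stddev) + 0.5)
        # out-of-range samples are rejected and drawn again
        if 0 <= index < len(lst):
            return lst[index]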

then

alist = ['an', 'am', 'apple', 'cool', 'why']
for _ in range(20):
    print(normal_choice(alist))

gives

why
an
cool
cool
cool
apple
cool
apple
am
am
apple
apple
apple
why
cool
cool
cool
am
am
apple
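
To see what the defaults imply without relying on a random sample, the per-index probabilities can be computed directly. This is only a sketch, not part of the answer's code: `index_probabilities` is a hypothetical helper, it assumes each sample is rounded to the nearest integer and out-of-range results are redrawn, and it needs Python 3.8+ for `statistics.NormalDist`.

```python
from statistics import NormalDist

def index_probabilities(n, mean=None, stddev=None):
    """Probability of each index 0..n-1 when a normal sample is rounded to
    the nearest integer and out-of-range results are redrawn.
    (Hypothetical helper, for illustration only.)"""
    if mean is None:
        mean = (n - 1) / 2      # same default as normal_choice: center of list
    if stddev is None:
        stddev = n / 6          # same default: list spans about -3..+3 sigma
    dist = NormalDist(mean, stddev)
    # probability mass of the interval [i - 0.5, i + 0.5), which rounds to i
    mass = [dist.cdf(i + 0.5) - dist.cdf(i - 0.5) for i in range(n)]
    total = sum(mass)           # redrawing renormalizes over the accepted range
    return [m / total for m in mass]

alist = ['an', 'am', 'apple', 'cool', 'why']
for item, p in zip(alist, index_probabilities(len(alist))):
    print(f"{item:>5}: {p:.3f}")
```

Passing a larger `stddev` flattens the probabilities toward uniform, and a smaller one concentrates them on the middle element, which is the trade-off raised in the comments on the question.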

AChampion · Answer 2 · 2023-02-19T16:17:42.073

Are you sure you really want a normal distribution? You could look at a beta distribution, which would probably give you what you need, e.g.:

>>> import random
>>> from collections import Counter
>>> alist = ['an', 'am', 'apple', 'cool', 'why']
>>> Counter(alist[int(random.betavariate(2, 2)*len(alist))] for _ in range(100))
Counter({'am': 20, 'an': 9, 'apple': 34, 'cool': 23, 'why': 14})
>>> Counter(alist[int(random.betavariate(10, 10)*len(alist))] for _ in range(1000))
Counter({'am': 183, 'apple': 621, 'cool': 189, 'why': 4, 'an': 3})
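
The one-liner can be wrapped in a small function. The sketch below is an assumption-laden illustration rather than part of the answer: `beta_choice` and its `alpha` parameter are hypothetical names, and the `min(...)` clamp is a defensive guard in case floating-point rounding ever makes `betavariate` return exactly 1.0, which would otherwise index one past the end of the list.

```python
import random
from collections import Counter

def beta_choice(lst, alpha=2.0):
    """Pick an element with a symmetric Beta(alpha, alpha) weighting.
    (Hypothetical helper, for illustration only.)"""
    if not lst:
        raise IndexError("cannot choose from an empty list")
    # betavariate returns a float between 0 and 1; scale it to an index
    index = int(random.betavariate(alpha, alpha) * len(lst))
    # defensive clamp: keeps the index valid if the sample rounds to 1.0
    return lst[min(index, len(lst) - 1)]

alist = ['an', 'am', 'apple', 'cool', 'why']
print(Counter(beta_choice(alist, alpha=2) for _ in range(1000)))
print(Counter(beta_choice(alist, alpha=10) for _ in range(1000)))
```

A larger `alpha` concentrates the picks around the middle of the list, and `alpha=1` reduces to a uniform choice, so this single parameter plays roughly the role that the standard deviation plays in the normal-based approach.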
