I have a Lambda function that is generating a random number from 0 to 22 on every invocation. However, I don't feel like the number is truly random, as I often get the same number back in succession when I quickly run the function multiple times.

The runtime is nodejs8.10 and I'm simply calling Math.floor(23*Math.random()) to generate the number.

To debug this, I ran the function 78 times over a duration of about 20 minutes, downloaded the logs from CloudWatch Logs and put the numbers in a file called numbers.txt:

$ cat numbers.txt | tr "\n" " "
5 22 19 7 14 3 14 19 8 1 15 4 7 17 6 5 19 11 18 17 15 5 0 20 11 20 12 12 14 16 5 13 19 19 10 18 21 19 12 20 8 11 16 19 1 1 4 5 2 5 11 3 20 4 2 12 3 6 2 17 20 11 16 1 20 22 1 21 15 17 1 1 1 2 5 5 13 12

Here's how often each number was generated:

$ cat numbers.txt | sort | uniq -c
1 0
8 1
1 10
5 11
5 12
2 13
3 14
3 15
3 16
4 17
2 18
7 19
4 2
6 20
2 21
2 22
3 3
3 4
8 5
2 6
2 7
2 8

Numbers 1 and 5 were both generated 8 times each, while 9 wasn't generated even once.

Are there any gotchas with randomness in AWS Lambda? Can I do something to get more random numbers?

stefansundin

2 Answers

I don't think these values look non-random.

Taking 78 samples from 23 values might be suspicious if it WERE consistently uniform. Taking 100,000 samples from 23 values would look suspicious if it were NOT uniform.

Simulating in Python to demonstrate.

Your current setup (78 samples of 23 values):

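import random

import matplotlib.pyplot as plt
import numpy as np

num_vals = 23
num_samples = 78

# results[v] counts how many times value v was drawn
results = [0] * num_vals

for i in range(num_samples):
    # randint is inclusive on both ends, so this draws from 0..22
    results[random.randint(0, num_vals - 1)] += 1

plt.bar(np.arange(num_vals), results)
plt.show()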

[Figure: bar chart of the count per value for 78 samples]

Here the most sampled value was selected 10x more often than the least sampled value. Run the same thing with num_samples changed to 10,000 and the distribution gets obviously more uniform, as you'd expect.

[Figure: bar chart of the count per value for 10,000 samples]
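
The flattening with more samples follows from the binomial spread of each count. A minimal sketch, assuming an ideal uniform generator; `count_spread` is a hypothetical helper introduced here for illustration, not part of the simulation above:

```python
import math


def count_spread(num_samples, num_vals=23):
    """Return (expected count, standard deviation, sd / expected) for a
    single value's count when num_samples uniform draws are spread over
    num_vals values. Each count is Binomial(num_samples, 1 / num_vals)."""
    p = 1 / num_vals
    mean = num_samples * p
    sd = math.sqrt(num_samples * p * (1 - p))  # binomial standard deviation
    return mean, sd, sd / mean


for n in (78, 10_000):
    mean, sd, rel = count_spread(n)
    print(f"{n:>6} samples: expected {mean:.1f} per value, "
          f"sd {sd:.1f}, relative spread {rel:.0%}")
```

With 78 samples the standard deviation of each count is roughly half of the expected count, so large gaps between values are normal. With 10,000 samples it drops to about 5% of the expected count.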

That was a single experiment. If I run your setup (78 samples of 23 values) 10,000 times, the counts are consistently highly skewed.

import random

import matplotlib.pyplot as plt

num_vals = 23
num_samples = 78
num_tests = 10000

# For each simulated run, store (highest count - lowest count)
max_minus_min = []

for j in range(num_tests):

    results = [0] * num_vals

    for i in range(num_samples):
        results[random.randint(0, num_vals - 1)] += 1

    max_minus_min.append(max(results) - min(results))

plt.hist(max_minus_min, bins=25)
plt.show()

More than 1/3 of the 10,000 simulations had max - min >= 8, which is the spread in your data (8 occurrences of 1 and 5, 0 occurrences of 9), so I don't think your results look that anomalous.

[Figure: histogram of max - min across the 10,000 simulations]
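
The same kind of simulation can be pointed at the numbers from the question directly. The sketch below is an illustration under stated assumptions rather than a definitive test: it assumes the 78 logged values are in invocation order, and the helpers `chi_square`, `adjacent_repeats`, and `simulated_p_values` are hypothetical names introduced here. It compares two statistics of the logged data (a chi-square distance from uniform, and the number of back-to-back repeats) against runs from Python's own generator.

```python
import random

NUM_VALS = 23

# The 78 numbers copied from the question (assumed to be in logged order).
observed = [
    5, 22, 19, 7, 14, 3, 14, 19, 8, 1, 15, 4, 7, 17, 6, 5, 19, 11, 18, 17,
    15, 5, 0, 20, 11, 20, 12, 12, 14, 16, 5, 13, 19, 19, 10, 18, 21, 19, 12,
    20, 8, 11, 16, 19, 1, 1, 4, 5, 2, 5, 11, 3, 20, 4, 2, 12, 3, 6, 2, 17,
    20, 11, 16, 1, 20, 22, 1, 21, 15, 17, 1, 1, 1, 2, 5, 5, 13, 12,
]


def chi_square(seq, num_vals=NUM_VALS):
    """Chi-square distance between the counts in seq and a uniform split."""
    expected = len(seq) / num_vals
    counts = [0] * num_vals
    for x in seq:
        counts[x] += 1
    return sum((c - expected) ** 2 / expected for c in counts)


def adjacent_repeats(seq):
    """Number of positions where a value equals the one right before it."""
    return sum(a == b for a, b in zip(seq, seq[1:]))


def simulated_p_values(seq, num_tests=10_000, seed=0):
    """Fraction of uniform simulations at least as extreme as seq, for the
    chi-square statistic and for the adjacent-repeat count."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    obs_chi, obs_rep = chi_square(seq), adjacent_repeats(seq)
    chi_hits = rep_hits = 0
    for _ in range(num_tests):
        sim = [rng.randrange(NUM_VALS) for _ in range(len(seq))]
        chi_hits += chi_square(sim) >= obs_chi
        rep_hits += adjacent_repeats(sim) >= obs_rep
    return chi_hits / num_tests, rep_hits / num_tests


p_chi, p_rep = simulated_p_values(observed)
print(f"chi-square = {chi_square(observed):.1f}, simulated p = {p_chi:.3f}")
print(f"adjacent repeats = {adjacent_repeats(observed)}, simulated p = {p_rep:.3f}")
```

For the logged data the chi-square statistic is about 31.1 and there are 6 back-to-back repeats, against roughly 77/23 ≈ 3.3 expected. A simulated p-value well above a few percent for either statistic means an ideal generator produces results this lopsided fairly often. Simulation is used instead of a chi-square table because the expected count per value (about 3.4) is small for the usual approximation.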

kmh

Any issue you have would be with Math.random itself, not with AWS Lambda. See also this and this related Stack Overflow question. See Wikipedia for an introduction to the theory of random number generation.

Alex Harvey