1

What is the absolute fastest way to generate a random integer between 1 and 1,000,000 in Python? I will be doing this over 1 million times per day, so I want maximum efficiency.

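With numpy (average over 20 runs: 0.0001679009692429128 s):

import time
import numpy as np
# Note: time.clock() was removed in Python 3.8; time.perf_counter() is its replacement.
t0 = time.clock()
print(np.random.randint(1, 1000000))
t1 = time.clock()
print(t1 - t0)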

With random (average: 0.0000920492372555262 s):

import random
import time
t2 = time.clock()
print(random.choice(range(1, 1000000)))
t3 = time.clock()
print(t3 - t2)

To my surprise, random was consistently faster than numpy. Is there a faster way?

Max Collier
jinyus
  • 2
    One million times per day is a very small amount for a computer. The method doesn't matter too much here. – user3483203 Jun 11 '18 at 01:24
  • 1
    Why don't you generate 1 million numbers once and just read from them when you need to? – Haleemur Ali Jun 11 '18 at 01:25
  • 8
    The time to print overwhelms the time to generate a random number by several orders of magnitude. – Reblochon Masque Jun 11 '18 at 01:26
  • 2
    Calling `random.choices(range(1,1000000), k=1000000)` takes about half a second. If you have to do that once a day it really should not be an issue – Olivier Melançon Jun 11 '18 at 03:03
  • "I want maximum efficiency" -- you probably spent more time writing your timing tests and composing this question than your CPU will spend generating a million numbers a day, even if your program runs every day for the rest of your life. Premature optimization is not an efficient use of programmer time. – John Coleman Jun 11 '18 at 12:09

4 Answers

3

numpy is more efficient when generating large samples (arrays) of random numbers. For example,

In [10]: %timeit np.random.randint(1,1000000, 1000000)
5.14 ms ± 64.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [11]: %timeit [random.choice(range(1,1000000)) for _ in range(1000000)]
1.01 s ± 14.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In addition, see "How can I time a code segment for testing performance with Pythons timeit?" on how to perform timing tests. When you use time.clock(), you should at least repeat the operation multiple times and then compute the mean time. It is more advisable to use timeit for timing tests. Also, as others have mentioned in the comments, print() takes significantly longer than random number generation, so your timing test is mostly measuring how fast print() works. Instead, you should do something like this:

In [12]: repeat = 1000000
    ...: t0 = time.clock()
    ...: for _ in range(repeat):
    ...:     np.random.randint(1, 1000000)
    ...: t1 = time.clock()
    ...: print((t1 - t0) / repeat)
1.3564629999999908e-06

In [13]: repeat = 1000000
    ...: t2 = time.clock()
    ...: for _ in range(repeat):
    ...:     random.choice(range(1, 1000000))
    ...: t3 = time.clock()
    ...: print((t3 - t2) / repeat)
1.0206699999999956e-06

So, for a single number, numpy is on average about 33% slower than the built-in random number generator (roughly 1.36 µs versus 1.02 µs per call). However, the earlier tests show that when generating large samples, numpy is significantly faster.
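As a rough sketch of the timeit approach recommended above, the per-call comparison could be scripted as follows. The helper name seconds_per_call and the loop counts are illustrative assumptions, not part of the original tests, and the absolute numbers will vary with the machine and library versions.

```python
import random
import timeit

import numpy as np


def seconds_per_call(func, number=20_000, repeat=3):
    """Hypothetical helper: estimate the cost of one call to func, in seconds."""
    # timeit.repeat returns the total time for `number` calls, once per repeat.
    # The minimum is the estimate least disturbed by other processes.
    totals = timeit.repeat(func, number=number, repeat=repeat)
    return min(totals) / number


# All three candidates draw from the same range, 1..999,999.
# random.randint includes its upper bound; the other two exclude it.
candidates = {
    "random.randint": lambda: random.randint(1, 999_999),
    "random.choice(range)": lambda: random.choice(range(1, 1_000_000)),
    "np.random.randint": lambda: np.random.randint(1, 1_000_000),
}

results = {name: seconds_per_call(func) for name, func in candidates.items()}
for name, secs in results.items():
    print(f"{name}: {secs * 1e9:.0f} ns per call")
```

Nothing is printed inside the timed calls, so the measurement reflects number generation (plus a small lambda-call overhead that is the same for every candidate).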

AGN Gazer
1

I wrote a test program. It showed that your task takes only about 1 second to complete, so use whichever method you prefer; it will not be your bottleneck.

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Xiang Wang @ 2018-05-23 16:49:00

import time
import random

start = time.time()

for i in range(1000000):
    random.randint(1, 1000000)

end = time.time()

print("total time: {}".format(end-start))


ramwin
1

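If you are using numpy, it is more efficient to generate all the values you need at once with np.random.randint, passing the number of values as the size argument (the older np.random.random_integers is deprecated in its favor). Both Python's random module and numpy's legacy np.random functions use the Mersenne Twister. More info: "Differences between numpy.random and random.random in Python"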
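A minimal sketch of the batch idea, assuming np.random.randint with a size argument is used for the bulk draw. The generator name random_int_stream and the batch size are hypothetical choices for illustration:

```python
import numpy as np


def random_int_stream(low, high, batch_size=100_000):
    """Yield ints in [low, high), refilling from numpy one batch at a time."""
    while True:
        # One vectorized call produces the whole batch.
        batch = np.random.randint(low, high, size=batch_size)
        # tolist() converts the array to plain Python ints before iterating.
        yield from batch.tolist()


stream = random_int_stream(1, 1_000_000)
first_five = [next(stream) for _ in range(5)]
print(first_five)
```

Each next(stream) then hands out a pre-generated value, and numpy is called again only when a batch runs out.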

qwr
0

random.getrandbits seems to be much faster than other random module tools.

%timeit random.randint(0,1000000)
799 ns ± 2.45 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%timeit random.choice(range(0,1000000))
742 ns ± 13.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

%timeit random.getrandbits(20)
83.9 ns ± 1.61 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Read this for further information: https://eli.thegreenplace.net/2018/slow-and-fast-methods-for-generating-random-integers-in-python/
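One caveat on the comparison above: random.getrandbits(20) returns a value from 0 to 2**20 - 1 (1,048,575), so it does not cover the same range as the other two calls. A hedged sketch of one way to map it onto an exact range is rejection sampling. The function names randbelow and randint_inclusive are made up for this example, and the extra Python-level call may give back part of the speed advantage:

```python
import random


def randbelow(n):
    """Return a uniform random int in [0, n), built on random.getrandbits."""
    if n <= 0:
        raise ValueError("n must be positive")
    k = n.bit_length()          # enough bits to represent every value below n
    r = random.getrandbits(k)   # uniform in [0, 2**k)
    while r >= n:               # discard out-of-range draws to avoid bias
        r = random.getrandbits(k)
    return r


def randint_inclusive(low, high):
    """Return a uniform random int in [low, high], both ends included."""
    return low + randbelow(high - low + 1)


samples = [randint_inclusive(1, 1_000_000) for _ in range(10_000)]
print(min(samples), max(samples))
```

For a range of 1,000,000 values, 20 bits are drawn per attempt and roughly 4.6% of attempts are rejected, so the loop rarely runs more than once.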

remort