Use itertools.permutations() on the remaining digits, picking an increasing number of them (up to 9), and combine each permutation with a leading digit between 1 and 9 (to prevent generating numbers with a leading 0).
from itertools import permutations

def generate_unique_numbers():
    yield 0
    for i in range(10):  # i = number of digits after the leading digit
        for leading in '123456789':
            if not i:  # 1-digit numbers
                yield int(leading)
                continue
            # permutations() of the sorted remaining digits keeps the
            # output in strictly increasing order
            remaining_digits = '0123456789'.replace(leading, '')
            for combo in permutations(remaining_digits, i):
                yield int(leading + ''.join(combo))
This generates all such valid numbers, in increasing order, without having to skip anything. There are 8877691 such numbers, ranging from 0 to 9876543210:
>>> sum(1 for _ in generate_unique_numbers())
8877691
>>> next(generate_unique_numbers())
0
>>> for i in generate_unique_numbers(): pass
...
>>> i
9876543210
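That count also follows from a direct combinatorial argument: besides 0 itself, an n-digit number with all-distinct digits has 9 choices for the leading digit and perm(9, n - 1) orderings of the remaining digits. A quick sanity check (math.perm needs Python 3.8 or newer):

>>> from math import perm
>>> 1 + sum(9 * perm(9, n - 1) for n in range(1, 11))
8877691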
A few samples of the output:
>>> from itertools import islice
>>> gen = generate_unique_numbers()
>>> list(islice(gen, 15))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15]
>>> list(islice(gen, 150, 165))
[204, 205, 206, 207, 208, 209, 210, 213, 214, 215, 216, 217, 218, 219, 230]
>>> list(islice(gen, 100000, 100015))
[542319, 542360, 542361, 542367, 542368, 542369, 542370, 542371, 542376, 542378, 542379, 542380, 542381, 542386, 542387]
>>> list(islice(gen, 1000000, 1000015))
[31279056, 31279058, 31279064, 31279065, 31279068, 31279084, 31279085, 31279086, 31279405, 31279406, 31279408, 31279450, 31279456, 31279458, 31279460]
This method is easily faster than generating all numbers with range(9876543211) and then filtering out those with repeated digits (which is what Moses Koledoye is doing).
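The timings below use a filtering generator along these lines as the baseline; the unique_count name matches the imports in the timeit calls, but the body here is a sketch of that approach, not necessarily the other answer's exact code:

def unique_count():
    # baseline: test every candidate, keep those whose digits are
    # all distinct
    for n in range(9876543211):
        digits = str(n)
        if len(set(digits)) == len(digits):
            yield n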
>>> from timeit import timeit
>>> from itertools import islice
>>> def consume_n(it, n): next(islice(it, n, n), None)
...
>>> timeit('consume_n(gen(), 10000)', 'from __main__ import consume_n, unique_count as gen', number=100)
1.825788974761963
>>> timeit('consume_n(gen(), 10000)', 'from __main__ import consume_n, generate_unique_numbers as gen', number=100)
0.6307981014251709
The above code generates just the first 10000 numbers for each approach, and repeats those tests 100 times; the consume_n() helper uses islice(it, n, n) to advance a generator n steps without building a list. My approach is close to 3 times faster!
Increase the count (and adjust the number of repetitions down to keep it manageable), and the contrast grows:
>>> timeit('consume_n(gen(), 100000)', 'from __main__ import consume_n, unique_count as gen', number=10)
4.125269889831543
>>> timeit('consume_n(gen(), 100000)', 'from __main__ import consume_n, generate_unique_numbers as gen', number=10)
0.6416079998016357
Now the difference has grown to 6 times faster. Generating the first million numbers just once takes 23 seconds with the range() version, and 0.67 seconds with the above generator:
>>> timeit('consume_n(gen(), 1000000)', 'from __main__ import consume_n, unique_count as gen', number=1)
23.268329858779907
>>> timeit('consume_n(gen(), 1000000)', 'from __main__ import consume_n, generate_unique_numbers as gen', number=1)
0.6738729476928711
And the further along the series you go, the more natural numbers must be skipped, so the speed difference only grows; all numbers in the range 8800000000-8899999999 must be skipped, for example, and the range() approach will test every one of those. That's 100 million wasted cycles!
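Every number in that block starts with two 8s, so none of them can have all-distinct digits; a full scan of the generator (a few seconds, see the timing below) confirms it:

>>> any(8800000000 <= n <= 8899999999 for n in generate_unique_numbers())
False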
The generator can produce all possible such numbers in 6.7 seconds on my laptop:
>>> from collections import deque
>>> def consume(it): deque(it, maxlen=0)
...
>>> timeit('consume(gen())', 'from __main__ import consume, generate_unique_numbers as gen', number=1)
6.6731719970703125
I didn't dare test how long the range() approach takes; there are just under 9 million such numbers, but the range() approach would test close to 10 billion possibilities, over 1000 times more numbers than needed.
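To put a number on that overhead:

>>> # candidates the range() approach tests per unique-digit number produced
>>> 9876543211 // 8877691
1112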