-1

i am trying to figure out how to generate a list of numbers that are not only unique but also have no similar digits The purpose is to crack sample hashes using crypt, what i know is the passwords are unique, no repeating digits and up to 6 digits

Example>

[1,2,3,4,5,6,7,8,9...120, 123,124,125,126,127... 123456,123457...]

so 122 or 123451 or 12345673 because there will be repeating numbers

my code

#!/usr/bin/env python

from os import getuid
from crypt import crypt
import time

# all initialisation here
password = range(1000000)
# all initialisation ends here

def VerifyHash(shadowHashes, salt, user):
    # crypt("Message", "Salt")
    for i in password:
        if (shadowHashes == crypt(str(i),salt)):
            print "Password of %s is " % user + str(i)
            break
        else:
            pass

def main():
    with open('shadow.txt') as p:
            for line in p:
                shadowList = line
                shadowHashes = shadowList.split(':')[1]
                user = shadowList.split(':')[0]
                salt = shadowHashes[:12]
                VerifyHash(shadowHashes,salt,user)  

main()

for now i am using range(1000000) and it works but is inefficient because the numbers i'm comparing to are unique with no repeating digits so i want to make my python program more efficient

Please point me in the right direction

CwhatC
  • 5
  • 1
  • 6

3 Answers3

3

Use itertools.permutations() on an increasing number of repeated digits, up to 9, combining these with all digits between 1 and 9 (to prevent generating numbers with a leading 0).

from itertools import permutations

def generate_unique_numbers():
    yield 0
    for i in range(10):
        for leading in '123456789':
            if not i:  # 1-digit numbers
                yield int(leading)
                continue
            remaining_digits = '0123456789'.replace(leading, '')
            for combo in permutations(remaining_digits, i):
                yield int(leading + ''.join(combo))

This generates all such valid numbers without having to skip anything. There are 8877691 such numbers, ranging from 0 to 9876543210:

>>> sum(1 for _ in generate_unique_numbers())
8877691
>>> next(generate_unique_numbers())
0
>>> for i in generate_unique_numbers(): pass
...
>>> i
9876543210

A few samples of the output:

>>> from itertools import islice
>>> gen = generate_unique_numbers()
>>> list(islice(gen, 15))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15]
>>> list(islice(gen, 150, 165))
[204, 205, 206, 207, 208, 209, 210, 213, 214, 215, 216, 217, 218, 219, 230]
>>> list(islice(gen, 100000, 100015))
[542319, 542360, 542361, 542367, 542368, 542369, 542370, 542371, 542376, 542378, 542379, 542380, 542381, 542386, 542387]
>>> list(islice(gen, 1000000, 1000015))
[31279056, 31279058, 31279064, 31279065, 31279068, 31279084, 31279085, 31279086, 31279405, 31279406, 31279408, 31279450, 31279456, 31279458, 31279460]

This method is easily faster than generating all numbers with range(9876543211) then filtering out those with repeated digits (which is what Moses Koledoye is doing):

>>> from timeit import timeit
>>> from itertools import islice
>>> def consume_n(it, n): next(islice(it, n, n), None)
...
>>> timeit('consume_n(gen(), 10000)', 'from __main__ import consume_n, unique_count as gen', number=100)
1.825788974761963
>>> timeit('consume_n(gen(), 10000)', 'from __main__ import consume_n, generate_unique_numbers as gen', number=100)
0.6307981014251709

The above code generates just the first 10000 numbers for each approach, and repeats those tests 100 times. My approach is easily 3 times faster!

Increase the count (and adjust the number of repetitions down to keep it manageable), and the contrast grows:

>>> timeit('consume_n(gen(), 100000)', 'from __main__ import consume_n, unique_count as gen', number=10)
4.125269889831543
>>> timeit('consume_n(gen(), 100000)', 'from __main__ import consume_n, generate_unique_numbers as gen', number=10)
0.6416079998016357

Now the difference has grown to 6x faster. Generating the first million numbers just once takes 23 seconds with the range version, .67 seconds with the above generator:

>>> timeit('consume_n(gen(), 1000000)', 'from __main__ import consume_n, unique_count as gen', number=1)
23.268329858779907
>>> timeit('consume_n(gen(), 1000000)', 'from __main__ import consume_n, generate_unique_numbers as gen', number=1)
0.6738729476928711

And the further along the series you go, the more natural numbers must be skipped and the speed difference only grows further; all numbers in the range 8800000000-8899999999 must be skipped for example, and the range() approach will test all of those. That's 100 million wasted cycles!

The generator can produce all possible such numbers in 6.7 seconds on my laptop:

>>> from collections import deque
>>> def consume(it): deque(it, maxlen=0)
...
>>> timeit('consume(gen())', 'from __main__ import consume, generate_unique_numbers as gen', number=1)
6.6731719970703125

I didn't dare test how long the range() approach takes; there are just under 9 million such numbers, but the range() approach will test close to 10 billion possibilities, 10000 times more numbers than needed.

Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
  • Thanks for the comment! appreciate it, tried it but realized @Moses Koledoye's code is shorter and does the same thing! (: – CwhatC Jul 27 '16 at 16:36
  • @CwhatC: Added performance numbers. You are free to go with Moses' approach, but it'll take **way** longer to complete. You wanted something more efficient you said? – Martijn Pieters Jul 27 '16 at 17:29
  • i will test out your method now, thanks for the feedback! i really appreciate the help and will try to learn as much as possible about the 2 approaches – CwhatC Jul 28 '16 at 11:13
3

You could check if the length of the number when represented as a string is the same when the digits in the number are placed in a set:

def unique_count():
    for i in range(1000000):
        if len(str(i)) == len(set(str(i))):
            yield i

The set eliminates duplicate digits, so numbers that do not have duplicated digits will pass the condition.

Moses Koledoye
  • 77,341
  • 8
  • 133
  • 139
  • This is works but is *hugely* wasteful as it'll go on to test 10k times more numbers than need to be produced, and as a result is about 3 times slower at generating the first 10000 numbers from my solution, getting slower and slower as the numbers get bigger. Note that you can [calculate the number of digits](http://stackoverflow.com/questions/2189800/length-of-an-integer-in-python) rather than use a set. Not that that'll help avoid the huge wastage. – Martijn Pieters Jul 27 '16 at 17:36
  • (and apologies, I had at first tested with `range()` on Python 2, rather than `xrange()`). – Martijn Pieters Jul 27 '16 at 17:53
  • @MartijnPieters No qualms. Thanks for the digit count alternative. You are right about the frequency of a number being skipped increasing with the size of the numbers. I didn't give this much thought. I bet other users who search this question will find your solution a far more efficient alternative – Moses Koledoye Jul 27 '16 at 17:59
0

You can use the generate function below for a maximum number i then it will generate a liste of digits smaller then i unique, and without a digit repeated.

def isDigitsRepeated(integer):
    n = str(integer)
    digits = []
    for digit in n:
        if digit in digits:
            return True
        digits += [digit]
    return False

def generate(maximum):
    lst = []
    for x in range(maximum):
        if not isDigitsRepeated(x):
            lst += [x]
    return lst