Optimize Python Loop For Brute-Force Password Cracker

Question

I am working on a Python brute-force password cracker. Currently, my list of characters contains around 92 characters, so every additional character in the password multiplies the number of possibilities by 92. A 5-character password already has almost 6.6 billion combinations. This takes my script literally hours to run, so I am wondering if there is any way to increase its speed. I have heard of multi-threading, but I am not sure how I could implement it in my code. If anyone has any ideas on how to speed this up, I would be happy to hear them.

import hashlib
import itertools

# nums, lower, upper, char, and pass_hash are assumed to be defined earlier in the script
flag2 = 0
for password_length in range(1, 35):
    for guess in itertools.product(nums + lower + upper + char, repeat=password_length):
        guess = ''.join(guess)
        enc_word = guess.encode('utf-8')
        digest = hashlib.md5(enc_word.strip()).hexdigest()

        if digest == pass_hash:
            print("password is: " + guess)
            flag2 = 1
            break
    if flag2 == 1:
        break

if flag2 == 0:
    # range(1, 35) stops at length 34
    print("password is over 34 characters long")

Regarding threads: You could have one thread per password length. Your current code uses one thread for all lengths — OneCricketeer, Dec 07 '20 at 22:41
The whole point of brute force is that it takes forever. If a method was fast, it would not be brute force. — M-Chen-3, Dec 07 '20 at 22:44
You can use a "multiprocessing.Pool". Each task can be defined, e.g., by password length and first password character. — Michael Butscher, Dec 07 '20 at 22:45
If you have an nvidia gpu, you could look into [using CUDA for md5 hashing in Python](https://stackoverflow.com/a/10048545/11789440). The downside is you either have to learn to use pycuda (preferable for you) or write non-python code directly for CUDA. — thshea, Dec 07 '20 at 22:47

Reynier Garcia · Answer 1 · 2020-12-08T00:13:26.917

This is a rather complicated and extensive problem to discuss. Multithreading only gets you so far, because the standard implementation of Python, CPython, uses the GIL (Global Interpreter Lock): a mutex that prevents multiple threads from executing Python bytecode at once, so multithreaded CPython programs cannot take full advantage of multiprocessor systems in certain situations. This is not a big problem when the tasks running in the threads are I/O bound (that is, they spend most of their time waiting for an input/output result, such as a network call to a service, a file read from disk, or a database query), because a waiting thread releases the GIL and lets another thread run; even so, only one thread executes Python bytecode at a time. Creating threads indiscriminately has its own perils, because context switching between a large number of threads is expensive for the CPU. For that reason it is sometimes better to use async programming, which uses a single execution thread; unfortunately, when the tasks are CPU bound, as in your case, async programming is of little use.
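To see the effect of the GIL on CPU-bound work, here is a minimal sketch that runs the same hashing job sequentially and in a thread pool and prints both timings. The function names, job count, and iteration count are illustrative assumptions, not part of the question's code. On a standard GIL-enabled CPython build the threaded run is typically no faster, though the exact numbers depend on the machine.

```python
import hashlib
import time
from concurrent.futures import ThreadPoolExecutor


def hash_many(count):
    """CPU-bound work: MD5-hash `count` short strings, return the last digest."""
    digest = ''
    for i in range(count):
        digest = hashlib.md5(str(i).encode('utf-8')).hexdigest()
    return digest


def timed(func):
    """Call func() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func()
    return result, time.perf_counter() - start


def run_sequential(jobs, count):
    """Run the jobs one after another in the main thread."""
    return [hash_many(count) for _ in range(jobs)]


def run_threaded(jobs, count):
    """Run the same jobs with one thread per job."""
    with ThreadPoolExecutor(max_workers=jobs) as executor:
        return list(executor.map(hash_many, [count] * jobs))


if __name__ == '__main__':
    # Illustrative workload: 4 jobs of 200,000 hashes each.
    for runner in (run_sequential, run_threaded):
        _, seconds = timed(lambda: runner(4, 200_000))
        print(f'{runner.__name__}: {seconds:.2f} s')
```

Both runners return the same digests; only the elapsed time is of interest here.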

Taking the above into account, and given that the work you need to do is CPU bound (it is an intensive operation that does not depend on I/O), what would improve the execution speed is multiprocessing and/or distributed computing. The speedup you get from multiprocessing depends on how many cores or CPUs your machine has. You must also take into account that communication between processes adds its own delays. Distributed computing would be the best option (as long as you have enough PCs in a cluster to handle the workload). In either case, one of the challenges you will face is how to distribute the workload among the different processes or machines in the cluster.

There are several ways to do multiprocessing and distributed computing in Python; the following are some references for further research:
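One possible way to split the workload, following the suggestion in the comments on the question, is one task per password length and first character, handed to a `multiprocessing.Pool`. The sketch below is illustrative only: `CHARSET`, `search_chunk`, `make_tasks`, and `crack` are hypothetical names, and the character set is a smaller stand-in because the question's `nums`, `lower`, `upper`, and `char` lists are not shown.

```python
import hashlib
import itertools
import string
from multiprocessing import Pool

# Assumed character set for illustration; replace with the real one.
CHARSET = string.ascii_lowercase + string.digits


def search_chunk(task):
    """Try every guess of the given length that starts with first_char."""
    target_hash, length, first_char = task
    for tail in itertools.product(CHARSET, repeat=length - 1):
        guess = first_char + ''.join(tail)
        if hashlib.md5(guess.encode('utf-8')).hexdigest() == target_hash:
            return guess
    return None


def make_tasks(target_hash, max_length):
    """Yield one task per (length, first character), shortest lengths first."""
    for length in range(1, max_length + 1):
        for first_char in CHARSET:
            yield (target_hash, length, first_char)


def crack(target_hash, max_length, processes=None):
    """Spread the tasks over worker processes; return the first match or None."""
    with Pool(processes) as pool:
        tasks = make_tasks(target_hash, max_length)
        for result in pool.imap_unordered(search_chunk, tasks):
            if result is not None:
                return result  # leaving the with block terminates the pool
    return None


if __name__ == '__main__':
    # Demo on a hash generated locally from a known short string.
    demo_hash = hashlib.md5(b'abc1').hexdigest()
    print(crack(demo_hash, max_length=4))
```

Each task is independent, so the only inter-process traffic is the task tuple going out and a single result coming back, which keeps the communication overhead mentioned above small. The `if __name__ == '__main__':` guard matters on platforms where worker processes are started by re-importing the main module.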

Speed Up Your Python Program With Concurrency
What Is the Python Global Interpreter Lock (GIL)?
Modern Parallel and Distributed Python: A Quick Tutorial on Ray

score 0 · Answer 2 · answered Dec 08 '20 at 15:09

I have tried adding multithreading and multiprocessing separately, and neither helped. Multithreading slowed the software down, and multiprocessing broke multiple times. I am now working on a new system of password cracking that uses the user's information. Many people build their passwords from some form of personal information, such as a username, PIN, birthday, first name, last name, middle name, initials, or some important date.

Also, if you can get one of a person's passwords, you may have all of them: statistically, people use the same or similar passwords for everything. It is easier to set up a phishing scam than to guess a password. Thank you all for your feedback.

score 0 · Answer 3 · answered Jun 05 '22 at 04:47

Ask the user whether they know the length of the password, and only try passwords of that length; also ask whether they know if it contains special characters, uppercase letters, etc. This will often narrow your character list from 92 to something like 40-50, which drastically reduces the number of combinations the program has to try. You might also want to make sure you never repeat a guess, and look into MD5 hash collisions, where two or more strings produce the same hash: any string that produces the target hash counts as a match, so if you know that several strings share a hash, you only need to try one of them.
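As a rough illustration of how much a smaller character set helps, the sketch below compares the number of 5-character candidates for 92 and 45 characters. The helper names and the rate of one million guesses per second are assumptions chosen for the example, not measurements of the question's script.

```python
def keyspace(charset_size, length):
    """Number of candidate passwords of exactly `length` characters."""
    return charset_size ** length


def hours_to_exhaust(charset_size, length, guesses_per_second):
    """Worst-case hours needed to try every candidate at the given rate."""
    return keyspace(charset_size, length) / guesses_per_second / 3600


# Hypothetical guessing rate, for illustration only.
RATE = 1_000_000

for size in (92, 45):
    total = keyspace(size, 5)
    hours = hours_to_exhaust(size, 5, RATE)
    print(f'{size} characters: {total:,} candidates, about {hours:.2f} hours')
```

With 92 characters there are 6,590,815,232 five-character candidates; with 45 there are 184,528,125, roughly 36 times fewer, and the gap widens with every additional character.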

