I have built a multiprocessing password cracker (using a wordlist) for a specific function. It halved the time needed compared to using a single process.

The original problem was that it would show the cracked password and terminate that worker, but the remaining workers would carry on until they ran out of words to hash. Not ideal.

My next step was to use `Manager().Event()` to terminate the remaining workers. This works as I had hoped (after some trial and error), but the application now takes far longer than it would as a single process. I'm sure this must be due to the `if` check inside `pwd_find()`, but I thought I would seek some advice.

#!/usr/bin/env python

import hashlib, os, time, math
from hashlib import md5
from multiprocessing import Pool, cpu_count, Manager

def screen_clear(): # Small function for clearing the screen on Unix or Windows
    if os.name == 'nt':
        return os.system('cls')
    else:
        return os.system('clear')

cores = cpu_count() # Var containing number of cores (Threads)

screen_clear()

print ""
print "Welcome to the Technicolor md5 cracker"
print ""

user = raw_input("Username: ")
print ""
nonce = raw_input("Nonce: ")
print ""
hash = raw_input("Hash: ")
print ""
file = raw_input("Wordlist: ")
screen_clear()
print "Cracking the password for \"" + user + "\" using " 
time1 = time.time() # Begins the 'Clock' for timing

realm = "Technicolor Gateway" # These 3 variables dont appear to change
qop = "auth"
uri = "/login.lp"

HA2 = md5("GET" + ":" + uri).hexdigest() # This hash doesn't contain any changing variables so doesn't need to be recalculated

file = open(file, 'r') # Opens the wordlist file
wordlist = file.readlines() # This enables us to use len()
length = len(wordlist)

screen_clear()
print "Cracking the password for \"" + user + "\" using " + str(length) + " words"

break_points = []  # List that will have start and stopping points
for i in range(cores):  # Creates start and stopping points based on length of word list
    break_points.append({"start":int(math.ceil((length+0.0)/cores * i)), "stop":int(math.ceil((length+0.0)/cores * (i + 1)))})

def pwd_find(start, stop, event):
    for number in range(start, stop):
        if not event.is_set():
            word = (wordlist[number])
            pwd = word.replace("\n","") # Removes newline character
            HA1 = md5(user + ":" + realm + ":" + pwd).hexdigest()
            hidepw = md5(HA1 + ":" + nonce +":" + "00000001" + ":" + "xyz" + ":" + qop + ":" + HA2).hexdigest()
            if hidepw == hash:
                screen_clear()
                time2 = time.time() # stops the 'Clock'
                timetotal = math.ceil(time2 - time1) # Calculates the time taken
                print "\"" + pwd + "\"" + " = " + hidepw + " (in " + str(timetotal) + " seconds)"
                print ""
                event.set()
                p.terminate
                p.join
        else:
            p.terminate
            p.join

if __name__ == '__main__':  # Added this because the multiprocessor module sometimes acts funny without it.

    p = Pool(cores)  # Number of processes to create.
    m = Manager()
    event = m.Event()
    for i in break_points:  # Cycles though the breakpoints list created above.
        i['event'] = event
        a = p.apply_async(pwd_find, kwds=i, args=tuple())  # This will start the separate processes.
    p.close() # Prevents any more processes being started
    p.join() # Waits for worker process to end

if event.is_set():
    end = raw_input("hit enter to exit")
    file.close() # Closes the wordlist file
    screen_clear()
    exit()
else:
    screen_clear()
    time2 = time.time() # Stops the 'Clock'
    totaltime = math.ceil(time2 - time1) # Calculates the time taken
    print "Sorry your password was not found (in " + str(totaltime) + " seconds) out of " + str(length) + " words"
    print ""
    end = raw_input("hit enter to exit")
    file.close() # Closes the wordlist file
    screen_clear()
    exit()

Edit (for @noxdafox):

def finisher(answer):
    if answer:
        p.terminate()
        p.join()
        end = raw_input("hit enter to exit")
        file.close() # Closes the wordlist file
        screen_clear()
        exit()

def pwd_find(start, stop):
    for number in range(start, stop):
        word = (wordlist[number])
        pwd = word.replace("\n","") # Removes newline character
        HA1 = md5(user + ":" + realm + ":" + pwd).hexdigest()
        hidepw = md5(HA1 + ":" + nonce +":" + "00000001" + ":" + "xyz" + ":" + qop + ":" + HA2).hexdigest()
        if hidepw == hash:
            screen_clear()
            time2 = time.time() # stops the 'Clock'
            timetotal = math.ceil(time2 - time1) # Calculates the time taken
            print "\"" + pwd + "\"" + " = " + hidepw + " (in " + str(timetotal) + " seconds)"
            print ""
            return True
        elif hidepw != hash:
            return False

if __name__ == '__main__':  # Added this because the multiprocessor module sometimes acts funny without it.

    p = Pool(cores)  # Number of processes to create.
    for i in break_points:  # Cycles though the breakpoints list created above.
        a = p.apply_async(pwd_find, kwds=i, args=tuple(), callback=finisher)  # This will start the separate processes.
    p.close() # Prevents any more processes being started
    p.join() # Waits for worker process to end
Andy
  • [This](http://stackoverflow.com/questions/14579474/multiprocessing-pool-spawning-new-childern-after-terminate-on-linux-python2-7) might be helpful – sgrg Nov 13 '15 at 00:08

2 Answers


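I think your hunch is correct. You are checking a synchronization primitive inside a fast loop, and every `is_set()` call on a `Manager` event is a round trip to the manager process. I would check whether the event is set only every so often, for example once every few hundred or thousand words. You can experiment to find the sweet spot: check often enough that the workers don't do much wasted work after the password is found, but not so often that the checks slow the program down.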
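As a rough illustration of checking "every so often", here is a minimal sketch. It is written for Python 3 (the question uses Python 2), and the names `find_password` and `check_every` are made up for this example. It hashes each word with plain MD5 rather than the digest scheme from the question, and it uses `threading.Event` as a stand-in so it runs without starting any processes. The assumption is that a `Manager().Event()` proxy could be passed in the same way, since both offer `is_set()` and `set()`.

```python
import hashlib
import threading


def find_password(words, target_hash, stop_event, check_every=1000):
    """Hash each word; poll stop_event only once per check_every words."""
    for index, word in enumerate(words):
        # Polling a Manager event costs a round trip to the manager process,
        # so do it occasionally instead of on every iteration.
        if index % check_every == 0 and stop_event.is_set():
            return None
        if hashlib.md5(word.encode("utf-8")).hexdigest() == target_hash:
            stop_event.set()  # tell the other workers to give up
            return word
    return None


if __name__ == "__main__":
    # threading.Event stands in for Manager().Event() in this demo.
    event = threading.Event()
    target = hashlib.md5(b"swordfish").hexdigest()
    print(find_password(["letmein", "swordfish", "hunter2"], target, event))
```

A larger `check_every` means fewer synchronised calls but more wasted hashing after another worker has already found the password, so the value is something to tune by measurement.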

Barry Rogerson
  • If I were to move the `if not event.is_set():` outside the `for number in range(start, stop):` loop and increase the `cores` variable (increasing the number of threads), would all the threads try to start at the same time, or would they wait until space was free? (I don't fully understand how this multiprocessing works.) If they wait, that should prevent them from starting once an answer is found. Am I correct? I will have a go once I am home. – Andy Nov 13 '15 at 17:05
  • All workers will start at the same time, but each worker runs one task at a time and won't pick up a new one until the current one finishes. Seeing as your code only schedules one task per worker, this point is moot. If you over-schedule your machine by creating many more workers than is sensible, they will have to time-share the CPU and could end up running slowly. But the number of workers is something you can play with to see how to make it go as fast as possible. – Barry Rogerson Nov 13 '15 at 20:14

You can use the Pool primitives to solve your problem. You don't need to share an Event object, whose access is synchronised and slow.

Here I give an example of how to terminate a Pool once a worker returns the desired result.

You can simply signal the Pool by returning a specific value and terminate the pool within a callback.
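The idea above could look roughly like the following sketch. It is written for Python 3 (the question uses Python 2), and `md5_hex`, `check_chunk`, `make_chunks` and `crack` are hypothetical names chosen for this example. It hashes each word with plain MD5 instead of the digest scheme from the question. Each task returns the matching word or `None`, and the callback, which runs in the parent process, terminates the pool on the first real answer.

```python
import hashlib
from multiprocessing import Pool


def md5_hex(text):
    """Return the hex MD5 digest of a str (Python 3 hashes bytes)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def check_chunk(words, target_hash):
    """Worker task: return the matching word, or None if the chunk has none."""
    for word in words:
        if md5_hex(word) == target_hash:
            return word
    return None  # only reached after every word in the chunk was tried


def make_chunks(words, n):
    """Split words into at most n contiguous slices of near-equal size."""
    size = max(1, -(-len(words) // n))  # ceiling division, at least 1
    return [words[i:i + size] for i in range(0, len(words), size)]


def crack(words, target_hash, processes=2):
    """Search the chunks in parallel and stop the pool on the first hit."""
    found = []
    pool = Pool(processes)

    def on_result(answer):
        # Runs in a helper thread of the parent process, so it can see `pool`.
        if answer is not None:
            found.append(answer)
            pool.terminate()  # a method call, so the parentheses matter

    for chunk in make_chunks(words, processes):
        try:
            pool.apply_async(check_chunk, (chunk, target_hash), callback=on_result)
        except ValueError:
            break  # the callback already terminated the pool
    pool.close()
    pool.join()
    return found[0] if found else None


if __name__ == "__main__":
    wordlist = ["letmein", "hunter2", "password", "swordfish"]
    print(crack(wordlist, md5_hex("swordfish")))
```

Anything interactive, such as the "hit enter to exit" prompt, is best left to the main code after `join()` returns rather than placed in the callback.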

noxdafox
  • Thank you @noxdafox, I tried this method previously but couldn't get it to work. I thought I would have another go using your post as an example to follow, and I ran into the same problem again: using `pdb` I can see that it runs through the callback function once or twice and then terminates, even though `answer` is never `True`. This is the function that is called through callback: `def finisher(answer): if answer: p.terminate p.join end = raw_input("hit enter to exit") file.close() screen_clear() exit()` – Andy Nov 14 '15 at 14:05
  • I apologise for how messy my comment looks, I don't know the etiquette for posting updated code! – Andy Nov 14 '15 at 14:07
  • Not sure you can post code in comments, just add it to the original post. Please post the code you tried. – noxdafox Nov 14 '15 at 14:09
  • sorry for the delay, please see updated example on the end of my original post. – Andy Nov 15 '15 at 19:14
  • Terminate and join are functions, not attributes: `p.terminate()` and `p.join()`. Also, do not use `raw_input` within the callback, as it is executed asynchronously. Just place it within the main loop (after `close` and `join`). – noxdafox Nov 15 '15 at 20:03
  • Wow, that was a silly mistake, thanks for pointing that out. Although that is a good point (and I will modify it accordingly), it never makes it far enough to actually run that code. It terminates all workers almost instantly when using the callback and ends as if it has run through the entire wordlist without finding a correct answer, but in only 3 seconds instead of 40 (even though I have given it a simple hash that it should find quickly!) – Andy Nov 15 '15 at 20:16
  • Looking at your worker loop, it returns immediately: the `elif ... return False` branch runs on the first word that does not match. Move the `return False` out of the `for` loop. – noxdafox Nov 15 '15 at 21:22
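The early return described in the last comment can be seen in isolation in this small sketch. The names `search_buggy` and `search_fixed` are made up for illustration, and a plain string comparison stands in for the hash comparison in `pwd_find`.

```python
def search_buggy(words, target):
    """Mirrors the bug: the else branch returns on the very first mismatch."""
    for word in words:
        if word == target:
            return True
        else:
            return False  # leaves the loop after checking only one word


def search_fixed(words, target):
    """Return False only after the whole range has been checked."""
    for word in words:
        if word == target:
            return True
    return False


if __name__ == "__main__":
    words = ["letmein", "hunter2", "swordfish"]
    print(search_buggy(words, "swordfish"))  # False: only "letmein" was tested
    print(search_fixed(words, "swordfish"))  # True
```

With the buggy version every worker finishes after a single word, which would match the reported behaviour of the whole run ending in a few seconds without a result.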