
For the past few days I have been working on a challenge: write an algorithm in Python 2.7 that finds a passphrase given a few hints.

Specifically:

  • An anagram of the passphrase is: "poultry outwits ants"
  • The MD5 hash of the secret phrase is "4624d200580677270a54ccff86b9610e"
  • A wordlist of candidate words

What approaches would you use to find the passphrase? I know that brute-forcing all possible combinations is guaranteed to work, but it would take far too long to complete (effectively never; I think there are around 20^20 possible combinations).

What I have come up with is to filter the wordlist based on the letters that appear in the anagram. If a word contains a letter that is not in the passphrase, I discard it. I also take character frequencies into account: I remove any word in which some character occurs more often than it does in the passphrase. Both checks are done at the word level, meaning I test each word individually. This narrowed the wordlist from 90k+ words down to 1700 unique words that could be part of the passphrase.
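The filter described above can be sketched roughly as follows. This is a minimal illustration, not my exact code: the names `fits_budget` and `filter_wordlist` are made up for this example, and it assumes the wordlist should be lowercased and stripped of surrounding whitespace before comparison.

```python
from collections import Counter

ANAGRAM = "poultry outwits ants"
# Letter budget of the passphrase, ignoring the spaces.
BUDGET = Counter(ANAGRAM.replace(" ", ""))

def fits_budget(word, budget=BUDGET):
    """Return True if `word` uses no letter more often than `budget` allows."""
    counts = Counter(word)
    # A Counter returns 0 for missing letters, so foreign letters fail here too.
    return all(counts[ch] <= budget[ch] for ch in counts)

def filter_wordlist(words, budget=BUDGET):
    """Keep unique, non-empty words that fit within the letter budget."""
    seen = set()
    kept = []
    for raw in words:
        word = raw.strip().lower()  # assumption: case does not matter
        if word and word not in seen and fits_budget(word, budget):
            seen.add(word)
            kept.append(word)
    return kept

# Tiny made-up sample; the real input would be the lines of the wordlist file.
sample = ["trust", "zebra", "stout", "tttttt", "yawls", "pantry"]
print(filter_wordlist(sample))  # ['trust', 'stout', 'yawls', 'pantry']
```

Here "zebra" is rejected because of letters that are not in the anagram, and "tttttt" because the anagram contains only four t's.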

Question 1: Is there anything more that can be done in order to reduce the number of possibilities? Is there a more clever way to take into account the frequencies of the letters?

Next, I thought that since I had narrowed it down to 1.7k words, I could try permutations (itertools.permutations()) of these words and check whether any of them matches the MD5 hash of the passphrase. Since the anagram contains two space characters, I assumed that the passphrase also consists of three words rather than just scrambled characters (maybe I am wrong, but it seems worth trying first). So I tried checking permutations of three words from the filtered wordlist. As it turned out, this approach is not fast enough for my laptop to produce any results either. The program reaches the memory limit and the computer freezes.
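For reference, the three-word check can be written so that it never holds all permutations in memory at once. The sketch below is an assumption about how such a loop could look, not the code that froze my machine; `md5_hex` and `brute_force` are hypothetical helper names. It relies on `itertools.permutations` returning a lazy iterator, so wrapping it in `list(...)` is what would exhaust memory.

```python
import hashlib
from itertools import permutations

def md5_hex(text):
    """Hex MD5 digest of a text string (assumes UTF-8/ASCII input)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def brute_force(words, target_hash, num_words=3):
    """Try every ordered selection of `num_words` distinct words, one at a time."""
    # permutations() yields tuples lazily; nothing is stored between iterations.
    for combo in permutations(words, num_words):
        phrase = " ".join(combo)
        if md5_hex(phrase) == target_hash:
            return phrase
    return None

# Toy demonstration with a made-up target, not the challenge hash.
toy_target = md5_hex("b c a")
print(brute_force(["a", "b", "c"], toy_target))  # b c a
```

Memory stays flat this way, but the running time does not improve: 1700 words give 1700 × 1699 × 1698, roughly 4.9 billion, ordered triples to hash.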

I also thought about considering pairs of words instead of triplets, and using their combined letter frequencies to filter out impossible combinations early, but I have not yet come up with a way to do it.
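One way the pair idea might be made concrete is to subtract each chosen word's letters from the remaining letter budget and only continue with words that still fit, so an impossible pair is dropped before any third word is tried. The following is a sketch under the assumption that the passphrase is exactly three space-separated words; `subtract` and `find_phrases` are hypothetical names, and the demo uses a toy anagram rather than the challenge data.

```python
import hashlib
from collections import Counter

def md5_hex(text):
    """Hex MD5 digest of a text string (assumes UTF-8/ASCII input)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

def subtract(budget, word):
    """Return `budget` minus the letters of `word`, or None if it does not fit."""
    remaining = budget.copy()
    for ch in word:
        if remaining[ch] == 0:
            return None
        remaining[ch] -= 1
    return remaining

def find_phrases(words, anagram, target_hash, num_words=3):
    """Yield phrases of `num_words` words that use every letter and match the hash."""
    budget = Counter(anagram.replace(" ", ""))

    def extend(chosen, remaining, letters_left):
        if len(chosen) == num_words:
            phrase = " ".join(chosen)
            if letters_left == 0 and md5_hex(phrase) == target_hash:
                yield phrase
            return
        last_slot = len(chosen) == num_words - 1
        for word in words:
            # The final word must consume exactly the letters that are left.
            if len(word) > letters_left or (last_slot and len(word) != letters_left):
                continue
            rest = subtract(remaining, word)
            if rest is None:
                continue  # prune: this prefix already overspends some letter
            for hit in extend(chosen + [word], rest, letters_left - len(word)):
                yield hit

    return extend([], budget, sum(budget.values()))

# Toy demonstration with made-up data, not the challenge hash.
toy_words = ["dog", "can", "rat", "cat", "ran", "god", "tan"]
toy_target = md5_hex("cat ran dog")
print(list(find_phrases(toy_words, "dog can rat", toy_target)))  # ['cat ran dog']
```

Only word sequences that are true anagrams of the hint ever reach the MD5 step, so the number of hashes computed should be far smaller than with plain permutations.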

Question 2: Is there a way to get more information about the passphrase without checking all the permutations? That task is prohibitively expensive for my laptop (at least for 1.7k words and above).

I tried using hashcat but found it too complicated, mainly because I could not understand what arguments I needed to give it. I ran a couple of mask attacks on the MD5 hash with no success. I also tried brute-forcing it, since hashcat can use the GPU, but that was still infeasible. I know there is an extensive wiki, but as someone with almost no background in hash cracking I did not find it very helpful. In addition, I would prefer to solve this on my own and rely on other programs as little as possible.

If you have any suggestions for solving this, please let me know. I am doing this for educational purposes, so any input will be greatly appreciated.

Thanks

sokras
