0

I'm working on a project in MATLAB that aims to break a user-inputted password using brute force. I've successfully set it up to break passwords up to eight characters in length, but I want it to be able to work for passwords of any length. As it stands, the code basically checks all possible one character passwords, then all possible two character passwords, etc. (Note: realpass is the user-inputted password, guess is the computer-generated guess, alphasize is the length of the alphabet of characters I give MATLAB to check at the start of the script, i.e., the number of possible characters.)

% check all 1 character passwords possible w/ given alphabet
if strcmp(guess, realpass) == 0
    for i = 1:alphasize(2)
        guess(1) = alphabet(i);
        if strcmp(guess, realpass) == 1
            break
        end
    end
end

% the password has more than one characters, check all possible 2 character passwords
if strcmp(guess, realpass) == 0
    for i = 1:alphasize(2)
        guess(1) = alphabet(i);
        for j = 1:alphasize(2)
            guess(2) = alphabet(j);
            if strcmp(guess, realpass) == 1
                break
            end
        end
        if strcmp(guess, realpass) == 1
            break
        end
    end
end

As you could probably tell, to get this out to 8 characters, there's a lot of copy/pasting, which is an excellent indicator that a loop can be used. My trouble is getting the loop to behave right. You can see my attempt on github. Does anyone have any advice for getting this thing up and running?

Thanks so much!

wbadart
  • 2,583
  • 1
  • 13
  • 27
  • Question - Why can't you simply check each character individually? For example, run through all characters of your alphabet for the first character and when we find a match, then move to the second character, etc.? Why do you have to generate all possible 1 permutations, then 2 permutations, etc.? – rayryeng Feb 27 '15 at 20:16
  • @rayryeng I think `strcmp` is only to fake if password was found or not. – CitizenInsane Feb 27 '15 at 20:21
  • @CitizenInsane - I'm not quite sure I follow. Could you elaborate? – rayryeng Feb 27 '15 at 20:22
  • @rayryeng He need to find `realpass` but he has no a priori knowledge about it ... `strcmp` is just to simulate access granted/denied from `guess` he is using. – CitizenInsane Feb 27 '15 at 20:30
  • Just a thought, and there are better ways than using nested loops, but if you notice in your code for check passwords of length 2 you're including the code for passwords of length one. You could put an additional check each time you generate a new character in `guess(1)` like so: `strcmp(guess(1), realpass)` – beaker Feb 27 '15 at 20:34
  • I've been looking for a duplicate, but haven't found anything to my liking so far. But I haven't given up yet. – beaker Feb 27 '15 at 20:37
  • The closest I've found to a duplicate is http://stackoverflow.com/questions/4165859/ but it probably isn't a perfect solution - it will generate all the combinations in one massive array rather than generating them one at a time, so it will eat tons of memory that the other way wouldn't. – Katie Feb 27 '15 at 20:39
  • @Katie I found this one http://stackoverflow.com/questions/18591440/how-to-find-all-permutations-with-repetition-in-matlab but the only answer without using built-in functions (which I assume is what the OP is after) is only pseudocode. – beaker Feb 27 '15 at 20:59

1 Answers1

2

A solution is to work from a linear index and transform it to some coordinate in the alphabet using ind2sub.

Code being more clear than long explanation here is my brute force solution:

function [testpass] = BruteForcePassword(realpass)
%[
    if (nargin < 1), realpass = 'hello'; end

    maxPassLength = 8;
    alphabet = 'abcdefghijklmnopqrstuvwxyz';

    for l = 1:maxPassLength,

        % Number of possible password for this length
        combinationCount = length(alphabet)^l; 

        % Put all possible combination in some sort of kilometer counter
        % where each digit can run up to the number of elements in the alphabet
        coordinate = cell(1,l);
        size = length(alphabet) * ones(1,l);

        for index = 1:combinationCount,

            [coordinate{:}] = ind2sub(size, index); % transform linear index into coordinate
            testpass = cellfun(@(i)alphabet(i), coordinate); % build password from current coordinate

            % Test if password is ok
            fprintf('Now testing: %s\n', testpass);
            if (strcmp(testpass, realpass))
                break;
            end

        end

        if (strcmp(testpass, realpass))
             break;
        end

    end
%]
end

Some explanations

Lets first consider we have an alphabet of 3 characters only (just because it is simpler and it is not loosing generality).

When we test for pass of length == 1, we have to test for all 3 positions in the alphabet vector.

1
2
3

When we test for pass of length == 2, we have to fix some position in the alphabet for 1st character and test all 3 positions for the second one. Then we fix 1st character to another position in the alphabet and we test again all 3 positions for the second one. And last we fix again position for 1st character and we test again all 3 positions for 2nd one.

+----- Position in the alphabet for first character
|
| + -------- Position in the alphabet for second character
v v

1 1  
1 2
1 3

2 1 
2 2
2 3

3 1
3 2
3 3

So going back to an alphabet of any length it is like we are running all position in an array of size NxNxN...xN where N is the number of positions in the alphabet. The number of dimension of this array is equal to the length of password.

In the code I'm using ind2sub function to rebuild array coordinate from linear index position.

NB: Another way to say it, it's like if we're counting in base-N where N is the number of elements in the alphabet.

Edit

As the number of combinations can grow quickly (N^l) and can go beyond system number representation for correct linear indexing ... here is a link to updated code to run all coordinate positions without using linear indexing at all (i.e. not using ind2sub):

>> Code without linear indexing <<

CitizenInsane
  • 4,755
  • 1
  • 25
  • 56
  • That's the ticket. A short explanation of what it's doing would help though. Just a note that you're counting in base `numel(alphabet)` or something. – beaker Feb 27 '15 at 20:44
  • @beaker ... yes i was trying to find textual explanation ... but haven't found quite synthetic way to express it (not even in my native langage though!) ... i'll try to think of it and will edit my answer. – CitizenInsane Feb 27 '15 at 20:47
  • 1
    @CitizenInsane - Yeah I totally misinterpreted the question. I gave you a +1. – rayryeng Feb 27 '15 at 20:55
  • Thanks @rayryeng, it's late we all have to go to sleep :) – CitizenInsane Feb 27 '15 at 21:19
  • 1
    lol. I have no excuse! I'm 6 hours behind you. It's only the afternoon here! – rayryeng Feb 27 '15 at 21:19
  • 1
    @rayryeng Take care, now going to bed ... As a add-on I updated my answer without using ind2sub because number of combination (N^l) can go far beyong machine precision quickly ... just went back to old carry propagation in base-N :) – CitizenInsane Feb 27 '15 at 22:38
  • This solution works great! It's kinda funny how the notes on my scratch paper are actually pretty similar to your explanation of the script, so that actually makes a lot of sense to me. I just wasn't sure (until now) how to put it into code. Thanks so much! – wbadart Feb 28 '15 at 16:41