17

This code has been floating around the net for quite a few years - it's apparently able to provide a password for decrypting an excel spreadsheet that you don't know the password to.

http://www.theofficeexperts.com/VBASamples/Excel02.htm

I'm interested in how this works but I don't seem to be able to work it out. I'm assuming it's encrypted under a certain value that can be gotten to in a number of ways (a few places on the net say it'll give you the original password or another one that will work) sort of like a public key - you can have 100 public keys that all work with a single private one.

To me it seems like it's creating integer variables and filling them with a specific number, before turning that number into the relevant characters. Wouldn't these always be the same? If so, is there "master password" for Excel protection?

Thanks all!

EDIT: I've noticed the For n = 32 To 126 in the example code I posted. Cross referencing with an ASCII table that seems to be all characters from space through to tilde. Is this some sort of dictionary attack I'm not understanding due to syntax?

brettdj
  • 54,857
  • 16
  • 114
  • 177
Andrew White
  • 600
  • 1
  • 9
  • 29
  • 5
    password protection in the old-style office document formats uses the cryptographic equivalent of a wet tshirt: "covered, but everything's still visible". it was never intended to be "secure". if you want security, then the newer 'x' versions do use better crypto and are a lot harder to get into. – Marc B Oct 12 '12 at 04:10
  • @MarcB so true - love the comment – Adrian Cornish Oct 12 '12 at 04:11
  • Thanks Marc. I love the analogy and do understand it's not secure. However I'm more interested in the method that is used. – Andrew White Oct 12 '12 at 04:11
  • 1
    there's no master key in office files. the algorithm used is beyond trivial to break (e.g. elcomsoft's office password recovery app can get an old-style office password pretty much instantly). there's no dictionary, no hard work. just a bit of bit fiddling to add some extra water onto the tshirt to make it even more transparent. – Marc B Oct 12 '12 at 04:13
  • Does the code actually work? Have you tested it? – mkingston Oct 12 '12 at 04:35
  • I added "worksheet" to the question title to be specific to what this code does. There is no known way other than brute-force to attack the file open password. – brettdj Oct 12 '12 at 04:40
  • 1
    @MarcB Actually, the sheet unlock code works on Xl2010 as well. – brettdj Oct 12 '12 at 04:57
  • 1
    It's sad that VBA protection is not worth anything either, it's even faster to get round than worksheet protection – SWa Oct 12 '12 at 10:42

5 Answers5

22

It is a small world, as the code indicates I posted that code at another forum around 10 years ago having seen it somewhere else - I think on John Walkenbach's old web site

It is important to note that this code protection applies to worksheet protection only - not to Excel's file open or VBA passwords.

  • One example of this write-up in full is here (screenshot below)
  • googling excel sheet protection “test” and “zzyw” gives other references such as this from Tom Urtis

enter image description here

brettdj
  • 54,857
  • 16
  • 114
  • 177
  • 1
    Nice to have my guess somewhat validated :). A 16 bit hash is pretty cruisy. Suppose it's only sheet protection rather than workbook protection. Do you know if the routine posted by OP produces all distinct results from the hash function? I suppose I could test this, but I'm a spot on the lazy side. – mkingston Oct 12 '12 at 05:26
  • @brettdj, I'm confused that your explanation doesn't match the Microsoft documentation. Are you saying that when you originally wrote this code that the hash was stored as a 16-bit value coded as a series of A's and B's followed by an ASCII character, and that your code merely tries all hash values? – vy32 Oct 14 '12 at 14:58
  • @vy32 It isn't my code, I posted it to `Experts-Exchange` having seen it somewhere else. I am unclear as to what MSFT documentation you are referring to? – brettdj Oct 14 '12 at 22:15
11

Edit (2020): From Excel 2013 on, apparently the protection scheme has changed. So the original answer only has historical significance anymore.

The new protection makes it near-impossible to retrieve the password by using state-of-the-art SHA-512 hashing. But why break it if you can simply pluck it out within seconds.

  • unzip the .xlsx or .xlsm file
  • edit xl/worksheets/sheet<num>.xml
  • search and remove <sheetProtection... /> tag
  • save, zip again, enjoy

Original answer (up to Excel 2010)

Fascinating - I knew the code snippet before, but not the explanation that brettdj posted. As the others explained, it is a brute-force search for hash collisions. Actually it seems to have been made by trial and error, since it does much more work than necessary (194560 combinations are generated, but there are only 32768 hashvalues possible.)

Excel's hash algorithm in short (as explained in http://chicago.sourceforge.net/devel/docs/excel/encrypt.html):

  1. Take the ascii code of each character of the passwort.
  2. Treat it as a 16-bit signed number. Shift its bits to the left, based on the position of the character (1 bit for first character, 2 for 2nd and so on)
  3. XOR all the characters together, giving a 16-bit signed int >=0.
  4. XOR that result with the length of the password and a magic number.

Knowing this, one can devise a brute-force search as follows:

  • The highest bit is always zero, so there are 15 bits to test.
  • Split them up into three counters each covering 5 bits. That way each counter can represent a printable ascii char.
  • Pack the ascii representation of those counters in a password string, in a way so that they do not affect each other.

The simplest way is to use a 11-character password and put the counters at position 1, 6 and 11. The bit-shifting in step 2 aligns the counter bits the right way: the first counter ("x") is shifted 1 bit, the second one ("y") 6 bits, the third one ("z") 11 bits. In a bitwise representation of the hash, the counters affect the following bits:

bit: 76543210 76543210
cnt: -zzzzyyy yyxxxxxz

The XOR operations can be ignored since the XOR argument is constant all the time. For the same reason, a constant offset (e.g. 64) can be added. Also it does not matter what character is used on the other password bytes (2-5, 7-10).

By iterating over all possible combinations of x, y, z you eventually find a password that gives the same hash value as the original one.

Public Sub demo()
    ' http://stackoverflow.com/questions/12852095/how-does-excels-worksheet-password-protection-work
    Dim x As Integer, y as Integer, z as Integer
    Dim part1 As String, part12 As String
    Dim sh As Worksheet
    Set sh = ThisWorkbook.Worksheets(1)

    sh.Protect "$ome_Insanely_Long_and_c0mplex_password! [(which i$ imp*ssible t0 re-member)]"

    For x = 64 To 95
        ' pad with dots, so that x, y and z affect nonoverlapping bits of the hash.
        part1 = Chr(x) + "...."
        For y = 64 To 95
            part12 = part1 + Chr(y) + "...."
            For z = 64 To 95
                On Error Resume Next
                    sh.Unprotect part12 + Chr(z)
                    If Err.Number = 0 Then
                        Debug.Print "Password: '" & part12 + Chr(z) & "'"
                        Exit Sub
                    End If
                On Error GoTo 0
            Next
        Next
    Next
End Sub
Torben Klein
  • 2,943
  • 1
  • 19
  • 24
  • 2
    This code didn't work when I tried it. However, [this page](http://www.mcgimpsey.com/excel/removepwords.html) includes a procedure (modified by Norman Harker and JE McGimpsey) to unlock a protected sheet (and it works quite well). – ARich Nov 13 '13 at 14:22
  • 2
    This is the best explanation by far that I've seen, and technically the only correct one since it identifies the true number of possible "passwords" as 2^15 = 32768. Excel's hash function is a [Surjective Function](http://en.wikipedia.org/wiki/Surjective_function), mapping a large number of input passwords to a small number of passwords. Torben identifies a subset of the input passwords which is [Bijective](http://en.wikipedia.org/wiki/Bijection) to the set of possible hashes, creating an optimized version of the oft-copied brute force pass cracker. – Blackhawk Nov 13 '13 at 14:37
  • Thanks, this algorithm is far more efficient! @ARich I tracked the algorithm on mcgimpsey.com to a Norwegian named Einar Ståle Huse. A link to his excel add-in was posted [here][1] in 1998 to which someone replied with the actual code. [1]: https://groups.google.com/d/msg/microsoft.public.de.excel/CfA0P4OZzfY/_ki5GR0wMYsJ – xXhRQ8sD2L7Z Mar 06 '15 at 03:35
  • The unlocking using "unzip the .xlsx or .xlsm file" does not work for "file-level protection". In that case, the whole Excel file is encrypted and it has another format than ZIP. See differences between "file-level" or "workbook/worksheet-level" protection here: https://support.microsoft.com/en-us/office/protection-and-security-in-excel-be0b34db-8cb6-44dd-a673-0b3e3475ac2d – Jan Šotola Nov 02 '21 at 09:12
  • Correct. File-level protection is to my knowledge unbreakable except for brute-force methods. Kindly note the word "worksheet" in the question title :-) – Torben Klein Nov 02 '21 at 10:58
3

Just a guess, but it looks to me like Excel tests if the password is valid by running it through some sort of hash function which produces a pretty small range of results and compares it with the hashed value stored.

Presumedly what this function is doing is testing all those values until it finds one that works. Judging by the values used, the hash function produces 2^11*(126-31) different values, all of which can be produced by the values generated in this code.

My analysis assumes this routine works. I haven't tested it.

mkingston
  • 2,678
  • 16
  • 26
0

The code does a brute force search using the encryption passwords AAAAAAAAAAA(SPACE) through BBBBBBBBBBB(~) where (SPACE) is the space character (CHR(32)) and (~) is of course character 126. When the password is found it displays the password in a message box.

Of course, this means that it is only checking passwords that are precisely 12 characters long and that only consist of upper-case letters A (ASCII 65) and B (ASCII 66) followed by one of the printable ASCII characters. @mkingston is correct that it tests 2^11*(126-31) different values. But there is no hash function. I don't think that this will crack many spreadsheets. You would be better off using One of these programs from AccessData.

For more details on ActiveSheet.Protect and ActiveSheet.Unprotect, see http://msdn.microsoft.com/en-us/library/office/aa191957(v=office.10).aspx.

vy32
  • 28,461
  • 37
  • 122
  • 246
  • The last character is incremented first. AA..A & chr(32) through AA..A & chr(126). Then AA..B & chr(32) through AA..B & chr(126). – mkingston Oct 12 '12 at 04:29
  • Sorry, probably barely got the comment in before you did :) – mkingston Oct 12 '12 at 04:32
  • Thanks for that, that answers the question quite well! So each time it tries a new one it runs it against ActiveSheet.Unprotect which returns a boolean indicating success? Fascinating! In theory the code I linked could be modified to run through all possible characters and variable character lengths, right? How do these external programs work? Are they using some sort of API to access the Unprotect function? – Andrew White Oct 12 '12 at 04:37
  • 4
    @andrewwhite Actually this doesn't properly understand the applicaton of this code. As evidenced by *I don't think that this will crack many spreadsheets* – brettdj Oct 13 '12 at 10:59
  • @brettdj, it won't crack many spreadsheets because most spreadsheets don't have an encryption password that consists only of the letters A or B of precisely this length followed by one of the ASCII characters. For example, this code will not crack the password "PASSWORD". – vy32 Oct 13 '12 at 13:00
  • 3
    @VY32 Read the link I provided. – brettdj Oct 13 '12 at 23:00
-3
Sub FindPassword()
    'Breaks worksheet password protection.
    Dim i As Integer, j As Integer, k As Integer
    Dim l As Integer, m As Integer, n As Integer
    Dim i1 As Integer, i2 As Integer, i3 As Integer
    Dim i4 As Integer, i5 As Integer, i6 As Integer
    On Error Resume Next
    For i = 65 To 66: For j = 65 To 66: For k = 65 To 66
    For l = 65 To 66: For m = 65 To 66: For i1 = 65 To 66
    For i2 = 65 To 66: For i3 = 65 To 66: For i4 = 65 To 66
    For i5 = 65 To 66: For i6 = 65 To 66: For n = 32 To 126
    ActiveSheet.Unprotect Chr(i) & Chr(j) & Chr(k) & _
        Chr(l) & Chr(m) & Chr(i1) & Chr(i2) & Chr(i3) & _
        Chr(i4) & Chr(i5) & Chr(i6) & Chr(n)
    If ActiveSheet.ProtectContents = False Then
        MsgBox "One usable password is " & Chr(i) & Chr(j) & _
            Chr(k) & Chr(l) & Chr(m) & Chr(i1) & Chr(i2) & _
            Chr(i3) & Chr(i4) & Chr(i5) & Chr(i6) & Chr(n)
         Exit Sub
    End If
    Next: Next: Next: Next: Next: Next
    Next: Next: Next: Next: Next: Next
End Sub
TLama
  • 75,147
  • 17
  • 214
  • 392