Edits and Updates
3/24/2013:
My output hash from Python is now matching the hash from c++ after converting to utf-16 and stoping before hitting any 'e' or 'm' bytes. However the decrypted results do not match. I know that my SHA1 hash is 20 bytes = 160 bits and RC4 keys can vary in length from 40 to 2048 bits so perhaps there is some default salting going on in WinCrypt that I will need to mimic. CryptGetKeyParam KP_LENGTH or KP_SALT
3/24/2013:
CryptGetKeyParam KP_LENGTH is telling me that my key ength is 128bits. I'm feeding it a 160 bit hash. So perhaps it's just discarding the last 32 bits...or 4 bytes. Testing now.
3/24/2013: Yep, that was it. If I discard the last 4 bytes of my SHA1 hash in python...I get the same decryption results.
Quick Info:
I have a c++ program to decrypt a datablock. It uses the Windows Crytographic Service Provider so it only works on Windows. I would like it to work with other platforms.
Method Overview:
In Windows Crypto API An ASCII encode password of bytes is converted to a wide character representation and then hashed with SHA1 to make a key for an RC4 stream cipher.
In Python PyCrypto An ASCII encoded byte string is decoded to a python string. It is truncated based on empircally obsesrved bytes which cause mbctowcs to stop converting in c++. This truncated string is then enocoded in utf-16, effectively padding it with 0x00 bytes between the characters. This new truncated, padded byte string is passed to a SHA1 hash and the first 128 bits of the digest are passed to a PyCrypto RC4 object.
Problem [SOLVED]
I can't seem to get the same results with Python 3.x w/ PyCrypto
C++ Code Skeleton:
HCRYPTPROV hProv = 0x00;
HCRYPTHASH hHash = 0x00;
HCRYPTKEY hKey = 0x00;
wchar_t sBuf[256] = {0};
CryptAcquireContextW(&hProv, L"FileContainer", L"Microsoft Enhanced RSA and AES Cryptographic Provider", 0x18u, 0);
CryptCreateHash(hProv, 0x8004u, 0, 0, &hHash);
//0x8004u is SHA1 flag
int len = mbstowcs(sBuf, iRec->desc, sizeof(sBuf));
//iRec is my "Record" class
//iRec->desc is 33 bytes within header of my encrypted file
//this will be used to create the hash key. (So this is the password)
CryptHashData(hHash, (const BYTE*)sBuf, len, 0);
CryptDeriveKey(hProv, 0x6801, hHash, 0, &hKey);
DWORD dataLen = iRec->compLen;
//iRec->compLen is the length of encrypted datablock
//it's also compressed that's why it's called compLen
CryptDecrypt(hKey, 0, 0, 0, (BYTE*)iRec->decrypt, &dataLen);
// iRec is my record that i'm decrypting
// iRec->decrypt is where I store the decrypted data
//&dataLen is how long the encrypted data block is.
//I get this from file header info
Python Code Skeleton:
from Crypto.Cipher import ARC4
from Crypto.Hash import SHA
#this is the Decipher method from my record class
def Decipher(self):
#get string representation of 33byte password
key_string= self.desc.decode('ASCII')
#so far, these characters fail, possibly others but
#for now I will make it a list
stop_chars = ['e','m']
#slice off anything beyond where mbstowcs will stop
for char in stop_chars:
wc_stop = key_string.find(char)
if wc_stop != -1:
#slice operation
key_string = key_string[:wc_stop]
#make "wide character"
#this is equivalent to padding bytes with 0x00
#Slice off the two byte "Byte Order Mark" 0xff 0xfe
wc_byte_string = key_string.encode('utf-16')[2:]
#slice off the trailing 0x00
wc_byte_string = wc_byte_string[:len(wc_byte_string)-1]
#hash the "wchar" byte string
#this is the equivalent to sBuf in c++ code above
#as determined by writing sBuf to file in tests
my_key = SHA.new(wc_byte_string).digest()
#create a PyCrypto cipher object
RC4_Cipher = ARC4.new(my_key[:16])
#store the decrypted data..these results NOW MATCH
self.decrypt = RC4_Cipher.decrypt(self.datablock)
Suspected [EDIT: Confirmed] Causes
1. mbstowcs conversion of the password resulted in the "original data" being fed to the SHA1 hash was not the same in python and c++. mbstowcs was stopping conversion at 0x65 and 0x6D bytes. Original data ended with a wide_char encoding of only part of the original 33 byte password.
- RC4 can have variable length keys. In the Enhanced Win Crypt Sevice provider, the default length is 128 bits. Leaving the key length unspecified was taking the first 128 bits of the 160 bit SHA1 digest of the "original data"
How I investigated edit: based on my own experimenting and the suggestions of @RolandSmith, I now know that one of my problems was mbctowcs behaving in a way I wasn't expecting. It seems to stop writing to sBuf on "e" (0x65) and "m"(0x6d) (probably others). So the passoword "Monkey" in my description (Ascii encoded bytes), would look like "M o n k" in sBuf because mbstowcs stopped at the e, and placed 0x00 between the bytes based on the 2 byte wchar typedef on my system. I found this by writing the results of the conversion to a text file.
BYTE pbHash[256]; //buffer we will store the hash digest in
DWORD dwHashLen; //store the length of the hash
DWORD dwCount;
dwCount = sizeof(DWORD); //how big is a dword on this system?
//see above "len" is the return value from mbstowcs that tells how
//many multibyte characters were converted from the original
//iRec->desc an placed into sBuf. In some cases it's 3, 7, 9
//and always seems to stop on "e" or "m"
fstream outFile4("C:/desc_mbstowcs.txt", ios::out | ios::trunc | ios::binary);
outFile4.write((const CHAR*)sBuf, int(len));
outFile4.close();
//now get the hash size from CryptGetHashParam
//an get the acutal hash from the hash object hHash
//write it to a file.
if(CryptGetHashParam(hHash, HP_HASHSIZE, (BYTE *)&dwHashLen, &dwCount, 0)) {
if(CryptGetHashParam(hHash, 0x0002, pbHash, &dwHashLen,0)){
fstream outFile3("C:/test_hash.txt", ios::out | ios::trunc | ios::binary);
outFile3.write((const CHAR*)pbHash, int(dwHashLen));
outFile3.close();
}
}
References:
wide characters cause problems depending on environment definition
Difference in Windows Cryptography Service between VC++ 6.0 and VS 2008
convert a utf-8 to utf-16 string
Python - converting wide-char strings from a binary file to Python unicode strings
PyCrypto RC4 example
https://www.dlitz.net/software/pycrypto/api/current/Crypto.Cipher.ARC4-module.html
http://msdn.microsoft.com/en-us/library/windows/desktop/aa379916(v=vs.85).aspx
http://msdn.microsoft.com/en-us/library/windows/desktop/aa375599(v=vs.85).aspx