2

I'm using libgcrypt to encrypt and decrypt files. When I take in the proper amount of bytes using fread, I need to pad it with 16-n bytes in order for it to properly be encrypted by gcry_cipher_encrypt. Upon decryption however, the null bytes/padding is still present. Is there any way to read and write in 16 byte blocks and still strip the padding at the end?

#include <stdio.h>
#include <gcrypt.h>

#define algo GCRY_CIPHER_AES128
#define mode GCRY_CIPHER_MODE_CBC
#define KEY_LENGTH 16
#define BLOCK_LENGTH 16

int main(){

    char IV[16];
    char *encBuffer = NULL;
    FILE *in, *out, *reopen;
    char *key = "A key goes here!";
    gcry_cipher_hd_t handle;
    int bufSize = 16, bytes;

    memset(IV, 0, 16);

    encBuffer = malloc(bufSize);

    in = fopen("in.txt", "r");
    out = fopen("out.txt", "w");

    gcry_cipher_open(&handle, algo, mode, 0);
    gcry_cipher_setkey(handle, key, KEY_LENGTH);
    gcry_cipher_setiv(handle, IV, BLOCK_LENGTH);

    while(1){
        bytes = fread(encBuffer, 1, bufSize, in);
        if (!bytes) break;
        while(bytes < bufSize)
            encBuffer[bytes++] = 0x0;
        gcry_cipher_encrypt(handle, encBuffer, bufSize, NULL, 0);
        bytes = fwrite(encBuffer, 1, bufSize, out);
    }

    gcry_cipher_close(handle);
    fclose(in);
    fclose(out);

    gcry_cipher_open(&handle, algo, mode, 0);
    gcry_cipher_setkey(handle, key, KEY_LENGTH);
    gcry_cipher_setiv(handle, IV, BLOCK_LENGTH);

    reopen = fopen("out.txt", "r");
    out = fopen("decoded.txt", "w");

    while(1){
        bytes = fread(encBuffer, 1, bufSize, reopen);
        if (!bytes) break;
        gcry_cipher_decrypt(handle, encBuffer, bufSize, NULL, 0);
        bytes = fwrite(encBuffer, 1, bufSize, out);
    }

    gcry_cipher_close(handle);

    free(encBuffer);

    return 0;
}
Chirality
  • 745
  • 8
  • 22
  • I'm no cryptography expert, but it looks like you're using AES, which is a block cipher, so it operates on blocks of bytes at a time (I suppose in this case 16). If you give it less than 16 bytes, it must be padding the block with 0s to get 16 bytes. Sounds like it is working correctly to me, why do you want to eliminate the null bytes at the end? – yano Mar 07 '16 at 04:15
  • Because say for instance you're operating on a text file where it's actually noticeable, text editors pop up with warnings and null bytes appear at the end. It's not so bad in pictures or other media files I'd imagine. – Chirality Mar 07 '16 at 04:15
  • I wouldn't loop on `while(1)`, I'd loop on `while(!FEOF(in))`. That will loop until the end of the file is reached. `bytes` will equal the number of bytes read by `fread`. I still think I'm missing something here. Encrypted data is a mess, as it should be. It is totally valid to have bytes of all values all through out encrypted data, including NULL bytes. If you're looking at encrypted data in a text editor, then you will see gobbledygook. If you're saying you encrypt, decrypt, and the decryption doesn't match the original then yes, there is a problem. – yano Mar 07 '16 at 04:25
  • 1
    That would be a good idea if while(!FEOF(in)) wasn't always a bad thing to do. See https://stackoverflow.com/questions/5431941/why-is-while-feof-file-always-wrong Plain -> Encrypted (Mess) -> Decrypted (+ \00\00\00\00) – Chirality Mar 07 '16 at 04:29
  • The link you cite also explains how to do it correctly. – n. m. could be an AI Mar 07 '16 at 04:39
  • Note that `fread` returns the number of bytes actually read. You must *never* assume this is the same as the number of bytes you requested; *always* check it! Any bytes in your buffer beyond this number have not been filled in. – Nate Eldredge Mar 07 '16 at 04:39
  • So fread returns the proper number of bytes but fwrite writes the number requested? And @n.m. what was suggested wasn't a valid fix – Chirality Mar 07 '16 at 04:42
  • goodness, I'll have to take a closer look at that when I've got some more time, thanks for posting that. Ok, I think I see 'a' problem. You are feeding extra data to the encryption function. You're feeding it 16 bytes at a time in an array initialized with NULL bytes. Once the file gets to the end, if its size is not a perfect multiple of 16, then you're actually encrypting and then decrypting bytes that weren't originally in the file (big surprise, they're NULL bytes). You probably want to call `gcry_cipher_encrypt(handle, encBuffer, bytes, NULL, 0);` instead – yano Mar 07 '16 at 04:42
  • `fwrite` also is not guaranteed to write as many bytes as you requested. It returns the number actually written. – Nate Eldredge Mar 07 '16 at 04:59
  • I mean, how to write the while loop correctly, not how to pad data with zeros... – n. m. could be an AI Mar 07 '16 at 05:07
  • Yeah, that's why I referenced it. However, changing the loop from `while(1)` to `while(!feof(fp))` won't do anything in this case. The real question is how to strip the padding. – Chirality Mar 07 '16 at 05:10

5 Answers5

1

Block ciphers have many modes of operation. Some require input data length to be a mmultiple of block size, and thus essentially require plaintext padding; some don't. See more about this here.

If you must use a mode that requires padding, you must save the plaintext length along with the encrypted data. The simplest way is to write it in an additional block in the end (encrypt that block too!). There are other, more sophisticated schemes, that don't always require addimg a block; see this.

n. m. could be an AI
  • 112,515
  • 14
  • 128
  • 243
  • oh ok, I think I see what you're saying. So really (s)he should write wrappers around `gcry_cipher_encrypt` and `gcry_cipher_decrypt` where you pass in data of arbitrary length and the wrappers add and extract the plaintext length from the last block,, and `decrypt_wrapper` will only write the plaintext data to file? – yano Mar 07 '16 at 05:55
  • @yano there are many ways to skin a cat... might jist use a streaming mode like CTR CFB or OFB instead of CBC. – n. m. could be an AI Mar 07 '16 at 06:24
  • sure,, I don't know enough about different modes of operation yet to really comment on that (I'm aware they exist, that is all). As a user of this library, however, I would not be satisfied; this seems like an incomplete implementation to me. I'm sure the authors have their reasons though,, maybe there's something to be said for having the freedom to implement your own padding scheme, which essentially is what this sounds like to me. – yano Mar 08 '16 at 05:35
  • @yano A crypto library provides standard crypto algorithms. Your satisfaction with standard algorithms, or lack thereof, is of little consequence. These algorithms deal with buffers composed of blocks of a specified size. Tacking anything on top of a standard algorithm is not welcome. Methods exist to deal with messages of arbitrary size but they are application-specific and have nothing to do with encrytion algorithms per se. – n. m. could be an AI Mar 08 '16 at 06:17
  • Fair enough, but I'll be sticking with OpenSSL. Or at least what I used way back when was a wrapper around that.. I wasn't dealing with block sizes or padding and had no trouble talking to a 3rd party device right out of the gate. If I recall that was CBC mode of operation. – yano Mar 08 '16 at 14:53
  • @yano openssl or rather libcrypto provides a high-level interface that optionally deals with block sizes and padding for you. In return you have to call the extra function EVP_EncryptFinal_ex() as the very last step of the encryption process. This will process the last partial block. – n. m. could be an AI Mar 08 '16 at 16:33
1

I was able to fix it by correctly storing the amount of padding and then checking for it later, suggested by zaph. I used PKCS#7 in order to determine how many bytes to write and of what type. I could've written NULL bytes, but there would've been no difference, so might as well stick to a standard.

#include <stdio.h>
#include <gcrypt.h>

#define algo GCRY_CIPHER_AES128
#define mode GCRY_CIPHER_MODE_CBC
#define KEY_LENGTH 16
#define BLOCK_LENGTH 16

int main(){

    char IV[16], *encBuffer = NULL;
    char *key = "A key goes here!";
    int bufSize = 16, bytes, i=0, padding;

    FILE *in, *out, *reopen;
    gcry_cipher_hd_t handle;

    memset(IV, 0, 16);
    encBuffer = malloc(bufSize);

    // Open in/out for reading and writing
    in = fopen("in.txt", "r");
    out = fopen("out.txt", "w");

    // Set handle for encryption
    gcry_cipher_open(&handle, algo, mode, 0);
    gcry_cipher_setkey(handle, key, KEY_LENGTH);
    gcry_cipher_setiv(handle, IV, BLOCK_LENGTH);

    // Read from in, write encrypted to out
    while(1){
        bytes = fread(encBuffer, 1, bufSize, in);
        if (!bytes) break;

        // If fread grabbed less than 16 bytes, that's our final line
        // Use the byte number for padding and pad N bytes of N
        if ( bytes < BLOCK_LENGTH ){ padding = 16-bytes; }

        while(bytes < bufSize)
            encBuffer[bytes++] = padding;
        gcry_cipher_encrypt(handle, encBuffer, bytes, NULL, 0);
        bytes = fwrite(encBuffer, 1, bufSize, out);
    }

    // Close handle and i/o files
    gcry_cipher_close(handle);
    fclose(in);
    fclose(out);

    // Set handle for decryption
    gcry_cipher_open(&handle, algo, mode, 0);
    gcry_cipher_setkey(handle, key, KEY_LENGTH);
    gcry_cipher_setiv(handle, IV, BLOCK_LENGTH);

    // Reopen outfile, open decoded file
    reopen = fopen("out.txt", "r");
    out = fopen("decoded.txt", "w");

    //Loop until EOF
    while(1){
        i=0;
        bytes = fread(encBuffer, 1, bufSize, reopen);
        if (!bytes) break;
        gcry_cipher_decrypt(handle, encBuffer, bufSize, NULL, 0);
        // Read each block and check for padding
        while ( i++ < BLOCK_LENGTH ){
            // If padding is found write 16-padding bytes
            if ( encBuffer[i] == padding ){
                bytes = fwrite(encBuffer, 1, (16-padding), out);
                return 0;
            }
        }
        // If padding isn't found, write the whole buffer
        bytes = fwrite(encBuffer, 1, bufSize, out);
    }

    // Close the handle and free the buffer
    gcry_cipher_close(handle);
    free(encBuffer);

    return 0;
}
Chirality
  • 745
  • 8
  • 22
  • Where are you saving the value of padding so you can read the file again later? What happens if the data happens to contain the padding value? – Jim Flood Mar 09 '16 at 06:40
  • Currently this is a solution for immediate encrypt/decrypt functions. If it were to be two separate processes where the padding is unknown to the decryption routine I'm not sure it would work. This wasn't meant to be production code, moreso just a test of the library. As for immediate padding and the data, do you mean what happens if the last byte is 3 and there is 3 padding left (... 3 3 3 3)? In that case, I'd assume this wouldn't work, but for the sake of this being tested on text files where ASCII characters in the 1-15 aren't being used, it works just fine. – Chirality Mar 09 '16 at 06:59
  • No I meant, if the file happened to be a multiple of 16, so you would not add any padding, and it just happened to end 3 3 3. When you read the file, this would look like it was padded with three bytes. If you had padded with 16, then the end of the file would be 3 3 3 16 16 16 16 16 16 16 16 16 16 16 16 16 16 16 16 and it is clear what is padding and what is data. – Jim Flood Mar 09 '16 at 19:01
  • If it was a multiple of 16, encBuffer wouldn't need any padding, therefore, encBuffer[i] never equals padding. The buffer is written out perfectly fine. I see what you're saying but it didn't apply in this case. This was strictly text files. – Chirality Mar 09 '16 at 19:14
  • Right, I understand. But in the general case, you would pad 16 bytes otherwise you would not know on the other end whether or not there is padding. – Jim Flood Mar 09 '16 at 19:16
  • I just tried reading/writing a file that had 3 \03 bytes at the end and has a total byte count that is multiple of 16. It worked perfectly fine. Any other cases you want me to test? – Chirality Mar 09 '16 at 19:30
0

cry_cipher does not seem to make any mention of padding but padding is necessary if the data to be encrypted is not always a multiple of the block size. Kind of a Bozo move not to mention padding, perhaps it is always PKCS#7 padding.

If there is padding the usual padding is PKCS#7 (PKCS#5 is essential the same thing). PKCS#7 always adds at least one to the block size bytes of padding and the padding value is the number or padding bytes. On decryption the padding is removed.

This means that if the input data to be encrypted is an exact multiple of the block size (16-bytes for AES) a block of padding will be added. SO with 16-bytes in the encrypted output will be 32-bytes.

zaph
  • 111,848
  • 21
  • 189
  • 228
0

This is in my comment, but I suspect you need to call gcry_cipher_encrypt(handle, encBuffer, bytes, NULL, 0); instead. fread returns the number of bytes it read from the file. If you constantly encrypt with bufSize bytes, then when you get to the end of the file in the loop, there are going to be extra bytes fed to the encryption function that were not part of the file, unless the file is a perfect multiple of 16 bytes. The encryption library doesn't know what's file data and what isn't, and it's correctly encrypting and decrypting the extra NULLs at the end that are in your encBuffer buffer. My guess is the user shouldn't have to worry about padding their own data, the library you're using should take care of that under the hood.

A quick check would be to initialize encBuffer with 0x20 and then see if your decrypted file has extra spaces at the end of it

yano
  • 4,827
  • 2
  • 23
  • 35
  • I saw. Fread reads the proper number of bytes but the encryption scheme *needs* 16 bytes to encrypt the block, so calling with bytes vs bufSize won't make a difference. I can pad it with 0x80s and they show. – Chirality Mar 07 '16 at 04:56
  • then IMO that is a lousy interface. Hell, you shouldn't even need to pass the data in blocks. You should say, "here's my data, it's x bytes long, now go encrypt it". If you have to pad your data yourself, how is the library supposed to know what's real data and what's padding? `Decrypt(Encrypt(message))` should == `message` every single time, byte for byte. However AES or any other cipher does padding is irrelevant. – yano Mar 07 '16 at 05:01
  • @user10984587 I still maintain that all you should pass to the encryption function is your actual content data, and the size of that data. If it's less than 16 bytes, that's fine, that should get handled under the hood. I've done some encryption with OpenSSL before and at no point was I padding my content data. If you tried my suggestion and it didn't work, sorry, I surrender. – yano Mar 07 '16 at 05:09
  • I've worked with OpenSSL as well, but libgcrypt is a whole new beast. If you pass anything under the block_size to the encryption, it'll yell at you or spit out random data. Assuming no padding as if it was all handled elsewhere, it is entirely garbled. Assuming I handled padding and still pass "bytes" in place of bufSize to encrypt, I'm back where I started with the proper text and X amount of null bytes left over from padding. – Chirality Mar 07 '16 at 05:12
  • @user10984587 Well I'm damn curious now, I'd like to experiment with this. If all you say is true then it simply doesn't work. What's the selling point here? "We can encrypt and decrypt any file whose size is a multiple of 16 that you've got!" Does it use a special padding character? What if you want to encrypt a file full of 0x80s? Good luck, but if possible I'd be moving on to another encryption library. – yano Mar 07 '16 at 05:19
  • @user10984587 Are you maybe calling a function that's "too far down" in their sauce? Maybe `gcry_cipher_encrypt` isn't meant to be a user interface function, and it's a bug that they don't have it hidden? Maybe there's a higher level user interface function you're supposed to use instead? Sorry, just brainstorming now. If you don't have this answered by tomorrow I would like to fiddle with this library. – yano Mar 07 '16 at 05:26
  • Already thought about looking for other libraries. The only padding is what I'm doing (the section `encBuffer[bytes++] = 0x0` for nulls). It will let me pad anything I please. I can encrypt a file full of 0x80s and they turn into `Â` with my key. Could be a bug, not sure. As for if this is too far down, it shouldn't be. I read entirely through the gcrypt documentation and this is the only way they have to encrypt/decrypt (besides different algorithms/modes, hashes, etc.) – Chirality Mar 07 '16 at 05:32
  • @user10984587 yeah I just read through the documentation. This answer is wrong, and `gcry_cipher_encrypt` is the user interface function. Hmmmm.... Any of the `gcry_` functions returning errors? – yano Mar 07 '16 at 05:42
  • Unfortunately no, I just added err checks to all of the gcry functions in the program and none of them came back positive. – Chirality Mar 07 '16 at 05:51
0

When you are writing the file and come to the last block, then just as your answer is doing, pad it out to a full 16 bytes. What you are missing is that if the file happens to end on a full 16 byte block, then add another full block of padding -- the padding size is 16 -- 16 bytes of the value 16 in a row.

Now when you read the file, just read to the end. The last block will be a full 16 bytes. Just look at the very last byte. Whatever value that last byte is, that is your padding. Remove that many bytes from the end of the last block. If the value is 16 remove the entire last block.

If you don't add a full 16-byte pad, you will never know if those last three bytes of 3, 3, 3 at the end of the file were padding, or, important data. It will look exactly the same.

Jim Flood
  • 8,144
  • 3
  • 36
  • 48