Base64 encoding in C - where is this one stray "X" coming from?

Question

I'm using Ryyst's code from here - How do I base64 encode (decode) in C? - to base64 encode an image file and insert it into an HTML document.

It works! - except on the second line of base64-encoded output there is a single stray "X" at the end of the line.

It's always the second line, and only the second line, no matter how large the binary file (I've tried many).

If I remove the stray "X" manually, the encoded data exactly matches the output of the base64 utility, and the image is correctly decoded by the browser.

I've tried adding "\0" to the ends of each char array to make sure they are properly terminated (made no difference). I've checked that "buffer" is always 60 bytes, and that output_length is always 80 bytes (they are). I've read and re-read Ryyst's code to see if anything there could cause it (didn't see anything, but I am a C n00b). I did a rain dance. I searched for a virgin to toss down a volcano (can't find either one around here). The bug is still there.

Here are the important bits of the code -

while (cgiFormFileRead(CoverImageFile, buffer, BUFFERLEN, &got) ==cgiFormSuccess)
{
  if(got>0)
  {
    fputs(base64_encode(buffer, got, &output_length), targetfile);
    fputs("\n", targetfile);
  }
}

And the base64_encode function is -

char *base64_encode(const unsigned char *data, size_t input_length,
    size_t *output_length)
{

  *output_length = 4 * ((input_length + 2) / 3);

  char *encoded_data = malloc(*output_length);
  if (encoded_data == NULL)
    return NULL;
  int i = 0, j = 0;
  for (i = 0, j = 0; i < input_length;)
  {

    uint32_t octet_a = i < input_length ? data[i++] : 0;
    uint32_t octet_b = i < input_length ? data[i++] : 0;
    uint32_t octet_c = i < input_length ? data[i++] : 0;

    uint32_t triple = (octet_a << 0x10) + (octet_b << 0x08) + octet_c;

    encoded_data[j++] = encoding_table[(triple >> 3 * 6) & 0x3F];
    encoded_data[j++] = encoding_table[(triple >> 2 * 6) & 0x3F];
    encoded_data[j++] = encoding_table[(triple >> 1 * 6) & 0x3F];
    encoded_data[j++] = encoding_table[(triple >> 0 * 6) & 0x3F];
  }

  for (i = 0; i < mod_table[input_length % 3]; i++)
    encoded_data[*output_length - 1 - i] = '=';

  return encoded_data;
}

(As you can see, I'm also using the cgic library v 205, but I don't think the problem comes from there, because it's giving the right number of bytes. And BUFFERLEN is a constant equal to 60.)

What am I doing wrong, guys?

(Even more frustratingly, I /did/ get Ryyst's algorithm to work flawlessly once before, so his code /does/ work.)

I'm compiling using gcc on an ARM-based Debian Linux system, if that makes any difference.

parkydr · Answer 1 · 2013-06-08T14:27:22.010


Comparing your function with the original, you've deleted:

encoded_data[j++] = encoding_table[(triple >> 0 * 6) & 0x3F];

Apart from that, the function is the same, so I'm guessing that's just a copy error.

The problem is that you are using BUFFERLEN rather than looking at got, which returns the amount of data actually read. The second line doesn't read the full 60 characters, so you are encoding whatever junk is at the end of the buffer.
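As a rough sketch of the point about got, here is one way the read-and-encode loop could look. Several things here are assumptions rather than the asker's actual code: fread stands in for cgiFormFileRead so the example runs without cgic, encode_file_in_chunks is a hypothetical helper name, and encoding_table and mod_table (not shown in the question) are assumed to be the standard Base64 alphabet and padding table. Only the got bytes that were actually read get encoded, and the result is written with fwrite and an explicit length, so it does not depend on the encoded buffer being null-terminated.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define BUFFERLEN 60 /* a multiple of 3, so '=' padding can only appear on the last line */

/* Assumed tables: standard Base64 alphabet and padding counts. */
static const char encoding_table[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
static const int mod_table[] = {0, 2, 1};

/* Encoder as in the question: the returned buffer is NOT null-terminated. */
static char *base64_encode(const unsigned char *data, size_t input_length,
                           size_t *output_length)
{
    *output_length = 4 * ((input_length + 2) / 3);
    char *encoded_data = malloc(*output_length);
    if (encoded_data == NULL)
        return NULL;

    for (size_t i = 0, j = 0; i < input_length;) {
        uint32_t octet_a = i < input_length ? data[i++] : 0;
        uint32_t octet_b = i < input_length ? data[i++] : 0;
        uint32_t octet_c = i < input_length ? data[i++] : 0;
        uint32_t triple = (octet_a << 16) + (octet_b << 8) + octet_c;

        encoded_data[j++] = encoding_table[(triple >> 18) & 0x3F];
        encoded_data[j++] = encoding_table[(triple >> 12) & 0x3F];
        encoded_data[j++] = encoding_table[(triple >> 6) & 0x3F];
        encoded_data[j++] = encoding_table[triple & 0x3F];
    }
    for (int k = 0; k < mod_table[input_length % 3]; k++)
        encoded_data[*output_length - 1 - k] = '=';
    return encoded_data;
}

/* Hypothetical helper: Base64-encode `in` to `out`, one line per chunk.
   Returns 0 on success, -1 on allocation or I/O failure. */
int encode_file_in_chunks(FILE *in, FILE *out)
{
    unsigned char buffer[BUFFERLEN];
    size_t got;

    /* fread returns how many bytes it really read; encode only those. */
    while ((got = fread(buffer, 1, sizeof buffer, in)) > 0) {
        size_t output_length;
        char *encoded = base64_encode(buffer, got, &output_length);
        if (encoded == NULL)
            return -1;

        /* Write exactly output_length bytes, so no terminator is needed. */
        size_t written = fwrite(encoded, 1, output_length, out);
        free(encoded); /* each call allocates, so release every chunk */
        if (written != output_length || fputc('\n', out) == EOF)
            return -1;
    }
    return ferror(in) ? -1 : 0;
}
```

With cgic, the same idea would apply to the buffer and got values filled in by cgiFormFileRead.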

edited Jun 08 '13 at 14:27

answered Jun 08 '13 at 14:04


Thanks parkydr! Updated my code as you recommended (typo copying from terminal as you guessed, and good point on "got"). But that "X" is still appearing in exactly the same place >>:-[ I will try forcing NULL termination again, see if it helps now... – Raw Jun 10 '13 at 12:34
You will need to terminate the output yourself. The function doesn't; it just gives you the length, so you'll need to set element [output_length] of the returned string to '\0' (and malloc one extra byte inside base64_encode to make room for it). Have you tried printing got to see how much you are actually reading? – parkydr Jun 10 '13 at 17:47
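
A minimal sketch of the termination fix described in the comment above, under a few assumptions: base64_encode_z is a hypothetical name for a terminated variant of the function in the question, and encoding_table and mod_table (not shown in the question) are assumed to be the standard Base64 alphabet and padding table. The only changes are the extra byte in malloc and the final '\0'. fputs keeps reading until it finds a zero byte, so without the terminator it can run past the end of the allocation and print stray characters.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed tables: standard Base64 alphabet and padding counts. */
static const char encoding_table[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
static const int mod_table[] = {0, 2, 1};

/* Hypothetical variant of base64_encode that returns a null-terminated
   string. *output_length still counts only the Base64 characters. */
char *base64_encode_z(const unsigned char *data, size_t input_length,
                      size_t *output_length)
{
    *output_length = 4 * ((input_length + 2) / 3);

    /* One extra byte so there is room for the terminator. */
    char *encoded_data = malloc(*output_length + 1);
    if (encoded_data == NULL)
        return NULL;

    size_t i = 0, j = 0;
    while (i < input_length) {
        uint32_t octet_a = i < input_length ? data[i++] : 0;
        uint32_t octet_b = i < input_length ? data[i++] : 0;
        uint32_t octet_c = i < input_length ? data[i++] : 0;
        uint32_t triple = (octet_a << 16) + (octet_b << 8) + octet_c;

        encoded_data[j++] = encoding_table[(triple >> 18) & 0x3F];
        encoded_data[j++] = encoding_table[(triple >> 12) & 0x3F];
        encoded_data[j++] = encoding_table[(triple >> 6) & 0x3F];
        encoded_data[j++] = encoding_table[triple & 0x3F];
    }

    /* Overwrite the trailing characters with '=' padding where needed. */
    for (int k = 0; k < mod_table[input_length % 3]; k++)
        encoded_data[*output_length - 1 - k] = '=';

    encoded_data[*output_length] = '\0'; /* now safe to pass to fputs */
    return encoded_data;
}
```

The caller still owns the returned pointer, so it should be checked for NULL and passed to free after fputs; the loop in the question never frees it.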
