I have seen many examples of how to implement Base64 encoders, but none of them use a struct inside a union to translate three 8-bit blocks into four 6-bit blocks. I have wondered why no one uses this method, because to me it looks easy and fast.

I wrote an example of the union-struct:

#include <cstdint>

namespace Base64
{
    typedef union
    {
        struct
        {
            uint32_t b2     : 0x08;
            uint32_t b1     : 0x08;
            uint32_t b0     : 0x08;
            uint32_t pad    : 0x08;
        } decoded;
        struct
        {
            uint32_t b3     : 0x06;
            uint32_t b2     : 0x06;
            uint32_t b1     : 0x06;
            uint32_t b0     : 0x06;
            uint32_t pad    : 0x08;
        } encoded;
        uint32_t raw;
    } base64c_t;
}

I tested translating 0xFC0FC0 (binary 111111000000111111000000) into four 6-bit blocks with this method, and it seems to work.

Base64::base64c_t b64;

b64.decoded.b0  = 0xFC;
b64.decoded.b1  = 0x0F;
b64.decoded.b2  = 0xC0;

std::cout.fill ( '0' );

std::cout << "0x" << std::hex << std::setw ( 2 ) << b64.encoded.b0 << std::endl;
std::cout << "0x" << std::hex << std::setw ( 2 ) << b64.encoded.b1 << std::endl;
std::cout << "0x" << std::hex << std::setw ( 2 ) << b64.encoded.b2 << std::endl;
std::cout << "0x" << std::hex << std::setw ( 2 ) << b64.encoded.b3 << std::endl;

Output:

0x3f
0x00
0x3f
0x00

Is there a downside to translating 8-bit blocks into 6-bit blocks this way? Or has no one thought of this approach before?

timrau
BufferOverflow
  • Could have something to do with type punning with unions being undefined behaviour. This probably works, but isn't guaranteed by the C++ standard. – user4581301 Oct 14 '15 at 22:34

1 Answer

The order in which bitfields are packed within a struct is implementation-defined. Therefore, although you get the correct base64 result on your machine, you may get a totally different (wrong!) answer when you port this code to a different architecture or compiler (e.g. big-endian PowerPC). To borrow from this answer:

Unspecified behavior

  • The alignment of the addressable storage unit allocated to hold a bit-field (6.7.2.1).

Implementation-defined behavior

  • Whether a bit-field can straddle a storage-unit boundary (6.7.2.1).
  • The order of allocation of bit-fields within a unit (6.7.2.1).

You are therefore better off using bit-shifting code (which is what basically every base64 implementation uses), since that will be guaranteed to be the same across platforms.
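
As a rough illustration of the bit-shifting approach, here is a minimal sketch that splits three bytes into four 6-bit values using only shifts and masks. The name `split_triplet` is made up for this example and is not taken from any particular library; the function only covers one complete 3-byte group, so padding and the lookup into the Base64 alphabet are left out.

```cpp
#include <array>
#include <cstdint>

namespace Base64
{
    // Hypothetical helper (name chosen for this sketch): splits three 8-bit
    // bytes into four 6-bit values. Shifts and masks operate on values, not
    // on memory layout, so the result does not depend on bit-field ordering
    // or endianness.
    constexpr std::array<std::uint8_t, 4> split_triplet ( std::uint8_t b0,
                                                          std::uint8_t b1,
                                                          std::uint8_t b2 )
    {
        return { {
            // top 6 bits of b0
            static_cast<std::uint8_t> ( b0 >> 2 ),
            // low 2 bits of b0, then top 4 bits of b1
            static_cast<std::uint8_t> ( ( ( b0 & 0x03 ) << 4 ) | ( b1 >> 4 ) ),
            // low 4 bits of b1, then top 2 bits of b2
            static_cast<std::uint8_t> ( ( ( b1 & 0x0F ) << 2 ) | ( b2 >> 6 ) ),
            // low 6 bits of b2
            static_cast<std::uint8_t> ( b2 & 0x3F )
        } };
    }
}
```

Calling `Base64::split_triplet ( 0xFC, 0x0F, 0xC0 )` yields 0x3f, 0x00, 0x3f, 0x00, the same values as the union test in the question.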

Community
nneonneo
  • So technically it is a bad idea to use struct with bitfields inside a union in all cases? – BufferOverflow Oct 14 '15 at 22:46
  • There are some limited uses (e.g. saving memory with many small objects, which can be critical on embedded hardware), but in general I would avoid their use. (I have definitely seen some truly crazy C code which uses bitfields to parse external input, but that's incredibly fragile and nonportable!) – nneonneo Oct 14 '15 at 22:49
  • You'll also see bitfield hacks come up often with memory-mapped or register-mapped I/O in very low-level systems programming; often the registers or memory layouts are specific to that hardware anyway so there isn't the same portability concern. (However, I personally still try to avoid those hacks and just go with bit-shift operators anyway - the compiler generates the same code in either case) – nneonneo Oct 14 '15 at 22:51
  • Sits in the last resort pool. If you need (and get) that extra few microseconds and there is nothing else to cut, you may find yourself doing some pretty off-the-wall, unsafe things and praying it doesn't bite you later. Measure and prove, then test the stuffing out of it, then leave comments that are pornographically explicit of what you've done and why and why saner solutions didn't work. – user4581301 Oct 14 '15 at 23:01
  • user4581301: No, I am not hunting microseconds. I have thought about this idea for a long time, but I have never used it. Yesterday I got a cryptography challenge/contest that is split into four parts, where the first part is to implement a Base64 (de/en)coder in C++. So I just had to ask why I have never seen something like this. But I understand now that I should not use this method. – BufferOverflow Oct 14 '15 at 23:11
  • ...this challenge wouldn't happen to be the NSA codebreaker challenge, would it? – nneonneo Oct 14 '15 at 23:22