
I've been able to encode a std::vector<char> to Base64 using boost and the following code:

using namespace boost::archive::iterators;

std::string message(binary.begin(), binary.end());
std::stringstream os;
using base64_text = insert_linebreaks<base64_from_binary<transform_width<const char *, 6, 8>>, 72>;

std::copy(
    base64_text(message.c_str()),
    base64_text(message.c_str() + message.size()),
    ostream_iterator<char>(os)
);

return os.str();

I found this on Stack Overflow. Now I want to do the same thing in reverse: put in a Base64-formatted std::string and end up with a std::vector<char>. But I can't adapt my example to do that. I found some other code online, which works nicely with a Hello World example, but when there's an actual, bigger Base64 string that also contains some critical characters like backslashes, the whole thing crashes.

This is what I'm doing now to decode:

using namespace std;
using namespace boost::archive::iterators;

typedef
    transform_width<
        binary_from_base64<string::const_iterator>, 8, 6
        > binary_t;
string dec(binary_t(str.begin()), binary_t(str.end()));
return dec;

It crashes in the last line before the return, when I'm about to create the string. Do you see what's wrong with it?

DenverCoder21
  • What's a backslash doing in a base64 encoded string? | "the whole thing crashes" -- Details? Is it throwing an exception due to invalid input and you're failing to catch it? – Dan Mašek Sep 21 '17 at 18:11
  • According to wikipedia it's the character with the value 63. – DenverCoder21 Sep 21 '17 at 18:28
  • In that case you've got your slashes mixed up: / is forward, \ is back. – Dan Mašek Sep 21 '17 at 18:32
  • Ah of course, my bad. There are no backslashes in my base64 string, just ordinary forward ones. – DenverCoder21 Sep 21 '17 at 18:34
  • Add a try/catch or run it in a debugger and see what you can find out about the reason for a crash. It would be also useful if you could provide a [mcve], including a sample input which causes the error. – Dan Mašek Sep 21 '17 at 18:35
  • I'll see if I can find some data that crashes and I'm allowed to post here. Converting a "Hello world" back and forth works fine. – DenverCoder21 Sep 21 '17 at 18:41
  • Please try your code with strings of 1, 2 and 3 bytes in length and report the result. I have seen Boost have a problem with strings whose length is not a multiple of 3... – Artemy Vysotsky Sep 21 '17 at 18:52
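To make the length issue from the last comment concrete, here is a small sketch of the arithmetic. The helper names are hypothetical and nothing here comes from Boost; it only shows how many Base64 characters n bytes produce when no '=' padding is written, and how many '=' characters would be needed to reach a multiple of 4.

```cpp
#include <cstddef>

// Hypothetical helper: number of Base64 characters produced for n input bytes
// when no '=' padding is written. Each character carries 6 bits, so the
// 8*n input bits are rounded up to whole characters.
constexpr std::size_t unpadded_base64_length(std::size_t n)
{
    return (n * 8 + 5) / 6;
}

// Hypothetical helper: number of '=' characters needed to bring the encoded
// text for n input bytes up to a multiple of 4 characters.
constexpr std::size_t padding_needed(std::size_t n)
{
    return (3 - n % 3) % 3;
}
```

For inputs of 1, 2 and 3 bytes this gives 2, 3 and 4 characters, so only the 3-byte case already ends on a 4-character boundary.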

3 Answers


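Base64 works on groups: every 3 input bytes become 4 output characters. The binary input therefore has to be padded to a multiple of 3 bytes and the encoded text to a multiple of 4 characters.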

Here's a function for decoding base64 using boost:

#include <boost/archive/iterators/base64_from_binary.hpp>
#include <boost/archive/iterators/binary_from_base64.hpp>
#include <boost/archive/iterators/transform_width.hpp>
#include <boost/archive/iterators/insert_linebreaks.hpp>
#include <boost/archive/iterators/remove_whitespace.hpp>
#include <algorithm>
#include <exception>
#include <string>

std::string decode(std::string input)
{
  using namespace boost::archive::iterators;
  typedef transform_width<binary_from_base64<remove_whitespace
      <std::string::const_iterator> >, 8, 6> ItBinaryT;

  try
  {
    // If the input isn't a multiple of 4, pad with =
    size_t num_pad_chars((4 - input.size() % 4) % 4);
    input.append(num_pad_chars, '=');

    size_t pad_chars(std::count(input.begin(), input.end(), '='));
    std::replace(input.begin(), input.end(), '=', 'A');
    std::string output(ItBinaryT(input.begin()), ItBinaryT(input.end()));
    output.erase(output.end() - pad_chars, output.end());
    return output;
  }
  catch (std::exception const&)
  {
    return std::string("");
  }
}

It was taken from here, where an encoding function with padding using boost can also be found.
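Since the linked encoder is not reproduced here, the following is a rough sketch of what encoding with '=' padding involves. It is deliberately written against the standard library only and is not the Boost-based function referred to above; the name encode_base64_padded is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of Boost: encodes bytes as Base64 and pads
// the result with '=' to a multiple of 4 characters.
std::string encode_base64_padded(const std::vector<char>& binary)
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve((binary.size() + 2) / 3 * 4);

    for (std::size_t i = 0; i < binary.size(); i += 3)
    {
        // Number of real bytes in this group (1, 2 or 3).
        const std::size_t n = std::min<std::size_t>(3, binary.size() - i);

        // Pack the group into 24 bits, zero-filling any missing bytes.
        unsigned long group = 0;
        for (std::size_t j = 0; j < 3; ++j)
        {
            group <<= 8;
            if (j < n) group |= static_cast<unsigned char>(binary[i + j]);
        }

        // Emit n + 1 data characters, then '=' up to 4 characters.
        for (std::size_t j = 0; j < 4; ++j)
        {
            if (j <= n) out += alphabet[(group >> (18 - 6 * j)) & 0x3F];
            else        out += '=';
        }
    }
    return out;
}
```

Output produced this way always has a length that is a multiple of 4, which is the form the decode function above expects after its own padding step.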

kenba

I tried kenba's accepted solution and ran into a few problems I've fixed below. First, trailing whitespace will cause any remove_whitespace iterator to skip past end(), causing a memory fault. Second, because you are calculating padding on the unfiltered string, any base64 encoded string which has whitespace in it will yield the incorrect number of pad_chars. The solution is to pre-filter whitespace before doing anything else with the string.

#include <boost/archive/iterators/base64_from_binary.hpp>
#include <boost/archive/iterators/binary_from_base64.hpp>
#include <boost/archive/iterators/transform_width.hpp>
#include <boost/archive/iterators/insert_linebreaks.hpp>
#include <boost/archive/iterators/remove_whitespace.hpp>
#include <algorithm>
#include <cctype>
#include <exception>
#include <string>
#include <vector>

inline void decode_base64( std::string input, std::vector<char>& output )
{
    using namespace boost::archive::iterators;
    typedef remove_whitespace<std::string::const_iterator> StripIt;
    typedef transform_width<binary_from_base64<std::string::const_iterator>, 8, 6> ItBinaryT;
    try
    {
        /// Trailing whitespace makes remove_whitespace barf because the iterator never == end().
        while (!input.empty() && std::isspace( static_cast<unsigned char>( input.back() ) )) { input.pop_back(); }
        /// Assign instead of swap: swap() cannot take a temporary in standard C++.
        input = std::string( StripIt( input.begin() ), StripIt( input.end() ) );
        output.clear();
        /// Nothing to decode; this also keeps input.end() - 4 below in range.
        if (input.empty()) { return; }
        /// If the input isn't a multiple of 4, pad with =
        input.append( (4 - input.size() % 4) % 4, '=' );
        size_t pad_chars( std::count( input.end() - 4, input.end(), '=' ) );
        std::replace( input.end() - 4, input.end(), '=', 'A' );
        /// Every 4 base64 characters decode to 3 bytes.
        output.reserve( input.size() / 4 * 3 );
        output.assign( ItBinaryT( input.begin() ), ItBinaryT( input.end() ) );
        output.erase( output.end() - (pad_chars < 2 ? pad_chars : 2), output.end() );
    }
    catch (std::exception const&)
    {
        output.clear();
    }
}
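The pre-filtering step can also be done without the remove_whitespace iterator. Below is a minimal sketch using only the standard library; strip_whitespace and count_trailing_padding are hypothetical names, and this is one possible way to do it rather than what the function above does.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns a copy of the text with all whitespace
// (spaces, tabs, line breaks) removed, wherever it appears.
std::string strip_whitespace(std::string text)
{
    text.erase(std::remove_if(text.begin(), text.end(),
                              [](unsigned char c) { return std::isspace(c) != 0; }),
               text.end());
    return text;
}

// Hypothetical helper: counts the '=' characters at the end of the text.
// Call it on the filtered string so that line breaks cannot hide the padding.
std::size_t count_trailing_padding(const std::string& text)
{
    std::size_t n = 0;
    while (n < text.size() && text[text.size() - 1 - n] == '=') ++n;
    return n;
}
```

Because the filter handles trailing whitespace as well, no separate pop_back loop is needed before it.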
cowtung

Here is a quote from boost/archive/iterators/transform_width.hpp in Boost 1.65.1:

// iterator which takes elements of x bits and returns elements of y bits.
// used to change streams of 8 bit characters into streams of 6 bit characters.
// and vice-versa for implementing base64 encodeing/decoding. Be very careful
// when using and end iterator.  end is only reliable detected when the input
// stream length is some common multiple of x and y.  E.G. Base64 6 bit
// character and 8 bit bytes. Lowest common multiple is 24 => 4 6 bit characters
// or 3 8 bit characters

So it looks like you have to pad your strings before encoding (for example with zero chars) to avoid such problems, and truncate the result after decoding.
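To illustrate the 6-bit to 8-bit regrouping the comment describes, here is a hand-written decoder sketch that does not depend on end-iterator detection at all. It uses only the standard library, assumes an ASCII-compatible character set, and the names base64_value and decode_base64_plain are hypothetical; it is not how Boost implements transform_width.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: maps one Base64 character to its 6-bit value,
// or returns -1 if the character is not in the alphabet.
// Assumes an ASCII-compatible character set.
int base64_value(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a' + 26;
    if (c >= '0' && c <= '9') return c - '0' + 52;
    if (c == '+') return 62;
    if (c == '/') return 63;
    return -1;
}

// Hypothetical helper: decodes Base64 text into bytes. Whitespace is skipped,
// decoding stops at the first '=', and any other invalid character throws.
std::vector<char> decode_base64_plain(const std::string& text)
{
    std::vector<char> out;
    out.reserve(text.size() / 4 * 3);
    unsigned int buffer = 0; // leftover bits carried between characters
    int bits = 0;            // number of valid bits currently in buffer

    for (char c : text)
    {
        if (c == '=') break;
        if (c == ' ' || c == '\t' || c == '\r' || c == '\n') continue;

        const int v = base64_value(c);
        if (v < 0) throw std::invalid_argument("invalid base64 character");

        // Append 6 bits; emit a byte as soon as 8 bits are available.
        buffer = (buffer << 6) | static_cast<unsigned int>(v);
        bits += 6;
        if (bits >= 8)
        {
            bits -= 8;
            out.push_back(static_cast<char>((buffer >> bits) & 0xFF));
            buffer &= (1u << bits) - 1; // keep only the bits not yet emitted
        }
    }
    return out;
}
```

Leftover bits at the end of an incomplete group are simply dropped, so no truncation step is needed afterwards.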

Artemy Vysotsky