2

So I'm working on an encryption/decryption method in C++ right now. It takes an std::string as an input (plus a "key" which will be used for encrypting the message), and it produces an std::string output which represents the encrypted string.

During the encryption process, I convert the std::string to an array of uint16_t and do some calculations on that as a part of the encryption. The reason for that is simply that a uint16_t value gives much more headroom to encrypt the original value via an algorithm than a char does.

The problem is that in order to give back the encrypted message as an std::string, I need to somehow convert the array of uint16_t values to something readable (that is, something that fits inside a char array without overflow). For that, I thought I could use base64, but all the base64 implementations I found only take std::string or char* as an input (8 bits/element). Obviously, if I passed my uint16_t array to one of them, I would never be able to get my original values back, because the base64 function casts each element down to 8 bits before converting it.

So here's my question: does anyone know a method of encoding a uint16_t array into a printable string (like base64), and back without any loss of data?

I know that I have to obtain the bytes of my data in order to use base64 but I'm not sure how to do that.

Thanks for any help in advance!

notadam
  • 3
    Hint: an array of `N` 16-bit values also happens to be an array of `2*N` 8-bit values. – Igor Tandetnik Jul 28 '14 at 22:35
  • "The reason for that is simply because a uint_16_t value gives much more headroom to encrypt the original value via an algorithm then a char does." That makes no sense. Pretty much every modern encryption algorithm is defined on a sequence of bytes. – CodesInChaos Jan 28 '15 at 11:15

4 Answers

3

You can use base-n mini-library, which provides generic, iterator-based I/O.

The following code outputs "1 2 3 4 65535" to stdout, as expected:

uint16_t arr[] { 1, 2, 3, 4, 65535 };
const int len = sizeof(arr)/sizeof(arr[0]);
std::string encoded;
bn::encode_b64(arr, arr + len, std::back_inserter(encoded));
uint16_t out[len] { 0 };
bn::decode_b64(encoded.begin(), encoded.end(), out);
for (auto c : out) {
    std::cout << c << " ";
}

Mandatory disclosure: I'm the lib's author

azawadzki
  • This sounds really nice! But since this will be a part of an encryption/decryption method, a more complicated solution adds more security, so I'm going with that byte-splitting method I posted. This library is cool though, I might use it some time. Thanks! – notadam Aug 04 '14 at 07:15
  • 2
    Regardless of the library you use, I believe that "more complicated solution adds more security" statement would be perceived by many as controversial. You might want to check what people think about "security through obscurity." – azawadzki Aug 04 '14 at 15:33
  • Well, if someone tries to decode the encrypted data, simply running it through a base-64 decoder isn't that big of a deal for them, therefore not that secure. There are multiple layers of encryption in my method, so this wouldn't reveal the data itself, but it certainly takes more effort for the hacker to figure out this byte-combination thing before continuing to crack the encryption. I hope so at least. – notadam Aug 04 '14 at 16:49
  • 2
    It is typically advised to follow a well-established encryption protocol rather than implement one ad hoc. I don't believe that one iteration of byte switching adds much in terms of resilience to serious automated cryptanalysis. I'm assuming you are not aiming at a mission-critical level of security, so I won't bore you any more ;) – azawadzki Aug 04 '14 at 20:51
  • I only wrote this encryption method for fun, nothing more. If I was designing something that other people would use, I would use something like RSA for sure. – notadam Aug 05 '14 at 06:20
1

So I finally solved it. I'm posting it in case someone else needs stuff like this. Basically I split the uint16_t values into two uint8_t each, and since those are 8-bit values, they can be used with any base64 implementation out there. Here's my method:

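#include <cstdint>   // uint8_t, uint16_t
#include <iostream>
using namespace std;

#define BYTE_T uint8_t
#define TWOBYTE_T uint16_t
// macro arguments are parenthesized so they also work with expressions
#define LOWBYTE(x)          ((BYTE_T)(x))
#define HIGHBYTE(x)         ((BYTE_T)((TWOBYTE_T)(x) >> 0x8))
#define BYTE_COMBINE(h, l)  ((TWOBYTE_T)(((BYTE_T)(h) << 0x8) | (BYTE_T)(l)))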

int main() {

    // an array with 16-bit integers
    uint16_t values[5] = {1, 2, 3, 4, 65535};

    // split the 16-bit integers into an array of 8-bit ones
    uint8_t split_values[10]; // notice that you need an array twice as big (16/8 = 2)
    int val_count = 0;
    for (int i=0; i<10; i+=2) {
        split_values[i] = HIGHBYTE(values[val_count]);
        split_values[i+1] = LOWBYTE(values[val_count]);
        val_count++;
    }

    // base64 encode the 8-bit values, then decode them back
    // or do whatever you want with them that requires 8-bit numbers

    // then reunite the 8-bit integers to the original array of 16-bit ones
    uint16_t restored[5];
    int rest_count = 0;
    for (int i=0; i<10; i+=2) {
        restored[rest_count] = BYTE_COMBINE(split_values[i], split_values[i+1]);
        rest_count++;
    }

    for (const auto &i : restored) cout << i << " ";
    cout << endl;

    return 0;
}

Of course, the same method works for arrays of any length and for other integer widths. You just need to alter the bit shifting and the for loops. This code can easily be modified to split 32-bit ints into 16-bit ones, or whatever really.
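Here is a rough sketch of the same split/recombine idea written as functions over std::vector, so the element count isn't hard-coded. The names split_bytes and join_bytes are made up for this illustration; the byte order (high byte first) follows the loops above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Split each 16-bit value into two bytes, high byte first.
std::vector<std::uint8_t> split_bytes(const std::vector<std::uint16_t>& values) {
    std::vector<std::uint8_t> bytes;
    bytes.reserve(values.size() * 2);
    for (std::uint16_t v : values) {
        bytes.push_back(static_cast<std::uint8_t>(v >> 8));    // high byte
        bytes.push_back(static_cast<std::uint8_t>(v & 0xFF));  // low byte
    }
    return bytes;
}

// Rebuild the 16-bit values from consecutive (high, low) byte pairs.
std::vector<std::uint16_t> join_bytes(const std::vector<std::uint8_t>& bytes) {
    assert(bytes.size() % 2 == 0 && "expected an even number of bytes");
    std::vector<std::uint16_t> values;
    values.reserve(bytes.size() / 2);
    for (std::size_t i = 0; i + 1 < bytes.size(); i += 2) {
        values.push_back(static_cast<std::uint16_t>((bytes[i] << 8) | bytes[i + 1]));
    }
    return values;
}
```

The output of split_bytes can be handed to any byte-oriented base64 encoder, and join_bytes can be applied to whatever the matching decoder returns. Because the byte order is fixed by the shifts, the result does not depend on the machine's endianness.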

notadam
0

Assuming that the uint16_t values range from zero to 63 and that you're using ASCII, just add 0x21 (hex 21) to each value and output it. This will create a printable string, but for display purposes you may also want to print a newline after some number of characters instead of displaying one very long string. Any decoder will have to subtract 0x21 from each character read from a file. If there are newlines in the file, the decoder should ignore them, and it should do this check before subtracting 0x21.

President James K. Polk
rcgldr
  • Actually that's not the case. The `uint16_t` values range from 1 to `UINT16_MAX` (which is 65535). They do represent ASCII characters though, but they are encoded. The algorithm that encrypts the characters takes a `char` and outputs a much larger number, a `uint16_t`, which represents that character in an encoded form. – notadam Jul 28 '14 at 22:44
  • Anyway, I think I could just split the 16-bit values into two 8-bit values each, then I could use any base64 function. I'll give it a shot. – notadam Jul 28 '14 at 22:45
  • Some 8 bit values could cause problems. You could "unpack" an array of 16 bit values into 6 bit values (use zero bits for padding if padding is needed), then add 0x21 to each 6 bit value and output it as an 8 bit value. The decoder would subtract 0x21 from each 8 bit value to produce the 6 bit value, then pack the 6 bit values to create an array of 16 bit values. You could use a special 8 bit value like 0x61 (decimal 64 + hex 21) to indicate the end of an encoded string. – rcgldr Jul 28 '14 at 22:50
  • @GregS - packed 6 bit values expanded to 8 bit values within the range of displayable ASCII characters by adding a value to them. Wiki example adds 0x41 ('A') to 6 bit values to produce 8 bit ASCII printable characters: [wiki base64](http://en.wikipedia.org/wiki/Base64) . In the wiki example, '=' (hex 3d) is used for padding at the end of the encoded string. – rcgldr Jul 28 '14 at 23:07
  • @GregS - I should have clarified. Note the original post asks for something like base64, not exactly base64, just printable. Using consecutive values would allow for encoding and decoding to be performed with add and subtract, as opposed to using a lookup / mapping table. – rcgldr Jul 28 '14 at 23:15
  • @adam10603 - Actual implementation of [Base64](http://en.wikipedia.org/wiki/Base64) normally uses two mapping tables: a 64 entry table indexed by 6 bit value with 8 bit values for encoding, and a 256 entry table indexed by encoded character with 6 bit values for decoding, plus some special handling for possible padding at the end of a string. The encoding table includes some "gaps" and doesn't map to consecutive values. If all you need is "printable" (displayable) characters, then you can encode to consecutive values by adding a constant ('!' or 'A') and decode by subtracting that constant. – rcgldr Jul 28 '14 at 23:53
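The "unpack into 6-bit values and add a constant" idea from these comments could look roughly like the sketch below. This is not standard base64: the function names encode_printable and decode_printable and the constant kOffset are invented for the example, and it assumes ASCII. No end marker is used here, because the leftover padding is always shorter than 16 bits and can simply be dropped by the decoder.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

const unsigned char kOffset = 0x21;  // '!', so 6-bit values map to '!' .. '`'

// Pack the 16-bit values into a bit stream and emit it 6 bits at a time.
std::string encode_printable(const std::vector<std::uint16_t>& values) {
    std::string out;
    std::uint32_t acc = 0;  // bit accumulator
    int bits = 0;           // number of unwritten bits held in acc
    for (std::uint16_t v : values) {
        acc = (acc << 16) | v;
        bits += 16;
        while (bits >= 6) {
            bits -= 6;
            out.push_back(static_cast<char>(((acc >> bits) & 0x3F) + kOffset));
        }
        acc &= (1u << bits) - 1;  // keep only the bits not yet written
    }
    assert(bits < 6);
    if (bits > 0) {
        // pad the final group with zero bits on the right
        out.push_back(static_cast<char>(((acc << (6 - bits)) & 0x3F) + kOffset));
    }
    return out;
}

// Reverse the process: subtract the offset and regroup into 16-bit values.
std::vector<std::uint16_t> decode_printable(const std::string& text) {
    std::vector<std::uint16_t> values;
    std::uint32_t acc = 0;
    int bits = 0;
    for (char ch : text) {
        if (ch == '\n' || ch == '\r') continue;  // skip line breaks before subtracting
        int v = static_cast<unsigned char>(ch) - kOffset;
        if (v < 0 || v > 63) throw std::invalid_argument("character outside the encoded range");
        acc = (acc << 6) | static_cast<std::uint32_t>(v);
        bits += 6;
        if (bits >= 16) {
            bits -= 16;
            values.push_back(static_cast<std::uint16_t>((acc >> bits) & 0xFFFF));
            acc &= (1u << bits) - 1;
        }
    }
    return values;  // any leftover bits (fewer than 16) are padding
}
```

The trade-off compared with real base64 is that the output alphabet includes punctuation such as quotes and backslashes, so it is printable but not necessarily safe to paste into URLs, JSON, or source code.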
0

See previous question here: base64 decode snippet in c++

Cast uint16_t* to unsigned const char* and encode, like so:

// Data to base64 encode
std::vector<uint16_t> some_data;

// Populate some_data...
// ...

// base64 encode it
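// (the bytes are read in the machine's native byte order; data() is also safe on an empty vector)
std::string base64_data = base64_encode(reinterpret_cast<unsigned char const*>(some_data.data()), some_data.size() * sizeof(uint16_t));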
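
For the way back, the decoded bytes have to be copied into a uint16_t buffer again. Below is a minimal sketch of both directions using std::memcpy instead of a pointer cast. The helpers to_raw_bytes and from_raw_bytes are hypothetical names for this example and are not part of any base64 library.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy the 16-bit values into a byte buffer in native byte order.
std::vector<unsigned char> to_raw_bytes(const std::vector<std::uint16_t>& values) {
    std::vector<unsigned char> bytes(values.size() * sizeof(std::uint16_t));
    if (!bytes.empty()) {
        std::memcpy(bytes.data(), values.data(), bytes.size());
    }
    return bytes;
}

// Copy a byte buffer (e.g. the output of a base64 decoder) back into 16-bit values.
std::vector<std::uint16_t> from_raw_bytes(const std::vector<unsigned char>& bytes) {
    assert(bytes.size() % sizeof(std::uint16_t) == 0);
    std::vector<std::uint16_t> values(bytes.size() / sizeof(std::uint16_t));
    if (!values.empty()) {
        std::memcpy(values.data(), bytes.data(), bytes.size());
    }
    return values;
}
```

Since the bytes are in native order, a string encoded on a little-endian machine would decode to different values on a big-endian one. If the encoded text has to be portable, splitting the values with explicit shifts is the safer choice.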
Community
pzed