
I want to create a function, say pack(), that takes a variable number of arguments and converts them to a series of bytes, e.g., a std::vector.

Given char c = 0x10, int x = 4, and const char *s = "AAA", pack() should behave like:

pack(c, x, s) = 0x10, 0x04, 0x00, 0x00, 0x00, 0x41, 0x41, 0x41.

(here I assume little-endian byte ordering)

How would I program such a function?

I've been thinking about C's va_list or C++'s template mechanisms, but I'm having trouble implementing this.

What is the "best" way of programming such a function? Are there any code snippets demonstrating a suitable technique?

Iharob Al Asimi
Shuzheng
  • Open your C++ book to the chapter that describes variadic templates, and start reading. This is one of the most complicated parts of C++, that cannot be fully addressed in one or two paragraphs in a stackoverflow.com answer. The short answer is: learn how to use variadic templates. – Sam Varshavchik Aug 20 '17 at 17:12
  • But given such template, how would I get the type of the arguments such that I can extract the bytes using, e.g., a `char *` pointer? – Shuzheng Aug 20 '17 at 17:14
  • You can implement a streaming operator and overload it. – Iharob Al Asimi Aug 20 '17 at 17:15
  • This should be explained in the aforementioned chapter of your C++ book, which should have plenty of examples of implementing recursive template functions that collect their arguments into a container of some kind. – Sam Varshavchik Aug 20 '17 at 17:15
  • Note that variadic templates appeared in C++11 only, so the book should be fairly modern. – yeputons Aug 20 '17 at 17:22
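
The streaming-operator suggestion from the comments could look roughly like the sketch below. The type name ByteStream and its bytes member are assumptions made for illustration, not something the commenters specified; the int overload assumes a 32-bit int and writes it in little-endian order, as the question expects.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical byte sink; the name ByteStream is an assumption for this sketch.
struct ByteStream {
    std::vector<std::uint8_t> bytes;
};

// A single char contributes exactly one byte.
inline ByteStream& operator<<(ByteStream& out, char c)
{
    out.bytes.push_back(static_cast<std::uint8_t>(c));
    return out;
}

// An int contributes four bytes, least significant first (assumes 32-bit int).
inline ByteStream& operator<<(ByteStream& out, int n)
{
    const std::uint32_t u = static_cast<std::uint32_t>(n);
    for (int shift = 0; shift < 32; shift += 8) {
        out.bytes.push_back(static_cast<std::uint8_t>((u >> shift) & 0xFF));
    }
    return out;
}

// A C-string contributes its characters without the terminating '\0'.
inline ByteStream& operator<<(ByteStream& out, const char* s)
{
    for (; *s != '\0'; ++s) {
        out.bytes.push_back(static_cast<std::uint8_t>(*s));
    }
    return out;
}
```

With these overloads, `ByteStream out; out << c << x << s;` appends the bytes of each operand in order, and supporting a new type only means adding another operator<< overload.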

3 Answers

You could do:

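#include <cstdint>
#include <string>
#include <vector>

void pack_in_vector(std::vector<std::uint8_t>& v, char c)
{
    v.push_back(static_cast<std::uint8_t>(c));
}

// Little-endian: least significant byte first (assumes a 32-bit int).
void pack_in_vector(std::vector<std::uint8_t>& v, int n)
{
    v.push_back(n & 0xFF);
    v.push_back((n >> 8) & 0xFF);
    v.push_back((n >> 16) & 0xFF);
    v.push_back((n >> 24) & 0xFF);
}

// Also accepts C-strings through the implicit conversion to std::string.
void pack_in_vector(std::vector<std::uint8_t>& v, const std::string& s)
{
    for (char c : s) {
        v.push_back(static_cast<std::uint8_t>(c));
    }
}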

template <typename ... Ts>
std::vector<std::uint8_t> pack(const Ts&... args)
{
    std::vector<std::uint8_t> bytes;
    (pack_in_vector(bytes, args), ...); // Folding expression requires C++17
    return bytes;
}

For C++11, you have to modify the last function to:

template <typename ... Ts>
std::vector<std::uint8_t> pack(const Ts&... args)
{
    std::vector<std::uint8_t> bytes;

    int dummy[] = {0, (pack_in_vector(bytes, args), 0)...};
    static_cast<void>(dummy); // avoid warning for unused variable
    return bytes;
}
WindyFields
Jarod42
  • Thanks, I really like this. Is there some way of modifying it, so that it compiles with C++11? – Shuzheng Aug 20 '17 at 17:28
  • For extra points, you could do `bytes.reserve((sizeof(args) + ...));`. – HolyBlackCat Aug 20 '17 at 17:30
  • @HolyBlackCat: Which is not necessarily accurate, e.g. `sizeof(std::string)`. – Jarod42 Aug 20 '17 at 17:34
  • @jarod42 it doesn't have to be perfect. In addition, a `byteof` function could be written that would make it perfect. Me, I'd solve this problem with a byte-sink pattern, then write the reserve and push back in terms of a byte-sink. As a bonus, recursive byte-sinking of containers could fall out. – Yakk - Adam Nevraumont Aug 20 '17 at 17:42
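
The reserve idea from these comments could be sketched as follows. The helper byte_size (standing in for the `byteof` mentioned above) and the name pack_reserved are hypothetical; the sketch uses the C++11 expander trick from the answer instead of a fold expression, and it assumes a 32-bit int.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helpers: how many bytes each kind of argument contributes.
inline std::size_t byte_size(char) { return 1; }
inline std::size_t byte_size(int) { return 4; }  // matches the four bytes written below
inline std::size_t byte_size(const std::string& s) { return s.size(); }

inline void pack_in_vector(std::vector<std::uint8_t>& v, char c)
{
    v.push_back(static_cast<std::uint8_t>(c));
}

inline void pack_in_vector(std::vector<std::uint8_t>& v, int n)
{
    // Little-endian, assuming a 32-bit int.
    const std::uint32_t u = static_cast<std::uint32_t>(n);
    for (int shift = 0; shift < 32; shift += 8) {
        v.push_back(static_cast<std::uint8_t>((u >> shift) & 0xFF));
    }
}

inline void pack_in_vector(std::vector<std::uint8_t>& v, const std::string& s)
{
    for (char c : s) {
        v.push_back(static_cast<std::uint8_t>(c));
    }
}

template <typename... Ts>
std::vector<std::uint8_t> pack_reserved(const Ts&... args)
{
    // First pass: add up the exact number of bytes, so reserve is accurate
    // even for std::string, where sizeof would be misleading.
    std::size_t total = 0;
    int count_dummy[] = {0, (total += byte_size(args), 0)...};
    static_cast<void>(count_dummy);

    std::vector<std::uint8_t> bytes;
    bytes.reserve(total);

    // Second pass: append the bytes of each argument in order.
    int pack_dummy[] = {0, (pack_in_vector(bytes, args), 0)...};
    static_cast<void>(pack_dummy);
    return bytes;
}
```

Each supported type needs a matching byte_size overload, which keeps the size computation next to the packing logic at the cost of a little duplication.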
#include <cstddef>
#include <cstring>
#include <vector>

// Feed every byte in [begin, end) to the sink f.
template<class F>
void range_to_bytes( F&& f, char const* begin, char const* end ){
  for(auto*it=begin; it != end; ++it)
    f(*it);
}
template<class F>
void to_bytes( F&& f, char c ){
  range_to_bytes(f, &c, &c+1);
}
// Emits the int's object representation, i.e. in host byte order.
template<class F>
void to_bytes( F&& f, int i ){
  range_to_bytes(f, (const char*)(&i), (const char*)(&i+1));
}
template<class F>
void to_bytes( F&& f, char const* str ){
  range_to_bytes(f, str, str+std::strlen(str));
}
// Variadic overload: applies to_bytes to each argument in order.
template<class F, class...Ts>
void to_bytes( F&& f, Ts const&... ts ){
  using discard=int[];
  (void)discard{0,(void(
    to_bytes(f, ts)
  ),0)...};
}
template<class...Ts>
std::vector<char> to_vector_bytes( Ts const&... ts ){
  std::size_t count = 0;
  to_bytes([&](char){++count;}, ts...);
  std::vector<char> r;
  r.reserve(count);
  to_bytes([&](char c){r.push_back(c);}, ts...);
  return r;
}
Jarod42
Yakk - Adam Nevraumont

Let me share my solution. Its advantage over the previously proposed ones is that it works for a wide range of types: fundamental types, static arrays, custom objects, containers (vector, list, string, ...), and C-strings (both literals and dynamically allocated).

If you want to limit those types (say, not to allow packing pointers) you can always add more SFINAE :) Or just a static_assert...

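// byte_pack.h (requires C++14: variable templates, std::enable_if_t)

#include <cstddef>
#include <cstdint>
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>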

// a small trait to check if it is possible to iterate over T
template<typename T, typename = void>
constexpr bool is_iterable = false;

template<typename T>
constexpr bool is_iterable<T, decltype(
        std::begin(std::declval<T&>()) != std::end(std::declval<T&>()), void())> = true;

typedef std::vector<std::uint8_t> byte_pack; // vector of bytes itself

template<typename T, std::enable_if_t<(!is_iterable<T>)>* = nullptr>
void pack(byte_pack& bytes, const T& value)  // for non-iterable values (int, double, custom objects, etc.)
{
    typedef const std::uint8_t byte_array[sizeof value];
    for(auto& byte : reinterpret_cast<byte_array&>(value)) {
        bytes.push_back(byte);
    }
}

template<typename T, std::enable_if_t<is_iterable<T>>* = nullptr>
void pack(byte_pack& bytes, const T& values) // for iterable values (string, vector, etc.)
{
    for(const auto& value : values) {
        pack(bytes, value);
    }
}

template<>
inline void pack(byte_pack& bytes, const char* const & c_str) // for C-strings
{
    for(auto i = 0; c_str[i]; ++i) {
        bytes.push_back(c_str[i]);
    }
}

template<>
inline void pack(byte_pack& bytes, char* const & c_str) { // for C-strings
    pack(bytes, static_cast<const char*>(c_str));
}

template<typename T, std::size_t N>
void pack(byte_pack& bytes, const T (&values) [N])  // for static arrays
{
    for(auto i = 0u; i < N; ++i) {
        pack(bytes, values[i]);
    }
}

// finally a variadic overload
template<typename... Args>
byte_pack pack(const Args&... args)
{
    byte_pack bytes;
    int dummy[] = { 0, (pack(bytes, args), 0) ... };
    static_cast<void>(dummy); // avoid warning for unused variable
    return bytes;
}

Tests:

#include "byte_pack.h"

#include <cstdint>
#include <iomanip>
#include <iostream>
#include <string>
#include <vector>

void cout_bytes(const std::vector<std::uint8_t>& bytes)
{
    for(unsigned byte : bytes) {
        std::cout << "0x" << std::setfill('0') << std::setw(2) << std::hex
                   << byte << " ";
    }
    std::cout << std::endl;
}

int main()
{
    // your example
    char c = 0x10; int x = 4; const char* s = "AAA";
    cout_bytes(pack(c, x, s));

    // static arrays and iterable objects
    char                            matrix1[2][2] = { {0x01, 0x01},  {(char) 0xff, (char) 0xff} };
    std::vector<std::vector<char>>  matrix2       = { {(char) 0x01, (char) 0x01},  {(char) 0xff, (char) 0xff} };
    cout_bytes(pack(matrix1, matrix2));

    // strings
    char*       str2 = new char[4] { "AAA" };
    std::string str1 = "AAA";
    cout_bytes(pack(str1, str2));

    // custom objects (remember about alignment!)
    struct { char a = 0x01;     short b = 0xff; }   object1;
    struct { short a = 0x01ff;  char b = 0x01; }    object2;
    cout_bytes(pack(object1, object2));

    return 0;
}

Output:

0x10 0x04 0x00 0x00 0x00 0x41 0x41 0x41
0x01 0x01 0xff 0xff 0x01 0x01 0xff 0xff
0x41 0x41 0x41 0x41 0x41 0x41
0x01 0x00 0xff 0x00 0xff 0x01 0x01 0x00
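
The remark about limiting the accepted types with a static_assert could be sketched as follows. This is only an illustration under assumptions: the helper name pack_raw is hypothetical, and the two checks (trivially copyable, not a pointer) are one possible policy rather than something this answer prescribes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical helper: appends the object representation of value to bytes,
// but refuses types for which a raw byte copy is unlikely to be meaningful.
template <typename T>
void pack_raw(std::vector<std::uint8_t>& bytes, const T& value)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "pack_raw: T must be trivially copyable");
    static_assert(!std::is_pointer<T>::value,
                  "pack_raw: refusing to pack a pointer value");

    // Grow the vector, then copy the bytes in host byte order.
    const std::size_t old_size = bytes.size();
    bytes.resize(old_size + sizeof(T));
    std::memcpy(bytes.data() + old_size, &value, sizeof(T));
}
```

With these checks, a call such as `pack_raw(bytes, some_pointer)` or `pack_raw(bytes, some_std_string)` is rejected at compile time with a readable message instead of silently copying an address or internal bookkeeping data.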
WindyFields
  • Does the "&" in `char* const & c_str` mean a reference? – Shuzheng Aug 21 '17 at 15:36
  • Ah, that is because my primary template takes `const T& value`. So if I write just `char*`, it is an invalid specialization (it can only be an overload). But if I make it an overload, overload priority comes into play, and when an array of chars `chars[]` is passed, this overload gets called. However, it is intended for C-strings only, so I would get a wrong result. – WindyFields Aug 21 '17 at 15:43
  • If you are not going to pass any heavy, non-iterable objects, then you can just replace `const T& value` with `T value`, and then `char*` and `const char*` (without that weird `const &`) will work just fine. – WindyFields Aug 21 '17 at 15:53
  • Ahh thanks, - what if I need to use the templates in several source files? Will I need to inline the implementations then? I need to define the templates in the same header as the declarations, right? – Shuzheng Aug 21 '17 at 16:08
  • Now, you can just put all this code in a separate header and it can be used as a header-only implementation. (Don't forget about include guards or `#pragma once`) – WindyFields Aug 21 '17 at 16:19
  • Why are only some of the template definitions inline? Are there alternatives to inline? – Shuzheng Aug 21 '17 at 17:09
  • Hmm, what is wrong with `inline`? Unspecialized templates may already be defined in several translation units, much like inline functions. However, I put inline on the full specializations because otherwise I would have to move them to a separate cpp file when including this header in more than one place (one definition rule). Here are some useful links: [inline templates](https://stackoverflow.com/questions/10535667/does-it-make-any-sense-to-use-inline-keyword-with-templates), [inline full template specializations](https://stackoverflow.com/questions/17667098/inline-template-function). – WindyFields Aug 21 '17 at 17:47
  • When the compiler uses unspecialized templates to create functions "under the hood", i.e., the template is invoked from some .cpp file, does the compiler then name the template-derived (created) functions so that multiple definitions are avoided? Or say a template, `pack()`, is invoked with the same types of parameters from two .cpp files: will this result in a multiple definition error, or is the compiler smart enough to make either one definition or two definitions with different names? – Shuzheng Aug 22 '17 at 09:05
  • @Shuzheng, yes, of course! That is the basics of template usage. [Read this, please.](https://stackoverflow.com/questions/6200752/c-header-only-template-library). Putting all templates in a header is standard practice. The compiler will figure everything out. – WindyFields Aug 22 '17 at 09:19