Metaprogramming for optimizing storage/run time algorithm, C++

Question

I'd like to generalize bitwise operators in C++ without thinking that the underlying structure is an array.

As instance... if I want to represent 86 bits i would use a structure structure/class like:

typedef struct {
 uint64_t x[1];
 uint16_t y[1];
 uint8_t z[1];
} sampleStruct;

Instead if i would like to allocate 160 bits I would use a structure like:

typedef struct {
 uint64_t x[2];
 uint32_t y[1];
} sampleStruct;

I guess a trivial, but not optimal solution for the storage would be to assume all chunks are uniform and allocate the minimum of those s.t. it covers the size I'm implementing, however even for a matter of exercise I prefer the way I exposed.

To me it sounds clear that I should use metaprogramming to solve the problem, so I have to properly define

template <int I>
typedef sampleStruct {
  //something
}

However I'm not a big expert on C++ template metaprogramming so i would like to understand what would be the best way to implement the different kind of sample struct varing I. I know how to decide the best "cover" for my length it would be something like:

N64 = I/64;
RemN = I%64;
if(0 < RemN <= 8) {
  add uint8_t var;
} else if (8 < RemN <= 16) {
  add uint16_t var;
} else if (16 < RemN <= 24) {
  add uint16_t var;
  add uint8_t var;
} else {
  //Similarly handle the other cases from 24 < RemN < 64
}

What can I do to achieve what I want to do?

I also guess that arraging chunks correctly would allow to achieve slightly better performance compared to other possible implementation.

Hoping it is clear enough... (Assume C++11 or more recent versions).

Sounds like you want [`std::bitset`](http://en.cppreference.com/w/cpp/utility/bitset). — Daniel, Mar 31 '16 at 10:34
I didn't know it exists, does it work in the way I described? (the storage i mean). — user8469759, Mar 31 '16 at 10:37
there's a talk on this: https://www.youtube.com/watch?v=ea5DiCg8HOY — Richard Hodges, Mar 31 '16 at 10:40
@lukkio: Look at [padding](http://www.drdobbs.com/cpp/padding-and-rearranging-structure-member/240007649). — Jarod42, Mar 31 '16 at 14:07

score 0 · Answer 1 · edited May 23 '17 at 12:15

It's possible, but there's a fair amount of typing involved. The problem is that C++ doesn't provide a way to meta-programmatically omit a data member (see e.g. Conditional Inclusion/Exclusion of Data Members Inside Class Templates) so you have to specialize on its presence or absence:

template<int N64, bool P32, bool P16, bool P8>
struct sampleStructImpl;

template<int I>
using sampleStruct = sampleStructImpl<I/64, (I%64 >= 32), (I%32 >= 16), (I%16 >= 8)>;

The various partial specializations (8 in total) would look like the following:

template<int N64>
struct sampleStructImpl<N64, true, true, true>
{
  std::uint64_t x[N64];
  std::uint32_t y;
  std::uint16_t z;
  std::uint8_t w;
};

template<int N64>
struct sampleStructImpl<N64, true, true, false>
{
  std::uint64_t x[N64];
  std::uint32_t y;
  std::uint16_t z;
  // omit std::uint8_t w;
};

// 6 more partial specializations...

Also, since zero-length arrays are illegal, if you want to be able to allow values of I less than 64 you'll have to specialize on N64 being zero:

template<>
struct sampleStructImpl<0, true, true, true>
{
  // omit std::uint64_t x[0];
  std::uint32_t y;
  std::uint16_t z;
  std::uint8_t w;
};

// 7 more specializations...

It'd be a lot more straightforward to use std::array<std::uint8_t, (I + 7) / 8>, possibly with an alignas modifier to 64-bit align it.

Nim · Answer 2 · 2016-03-31T11:46:59.497

0

Wouldn't it be easier to use just an array of uint8_t, for example:

template <int I>
struct sampleStruct {
  std::uint8_t v[(I % 8)? (I / 8) + 1 : (I / 8)];
};

As you mentioned bits, I'm assuming that you don't access the individual members x,y,z, (it's not clear from the question how you would access the underlying bits..)

It's possible to do what you want as well, but you have to use inheritance, as follows:

template <int I>
struct largeBlock {
    std::uint64_t x[I];
};
template <>
struct largeBlock<0> {
};

template <int I>
struct mediumBlock {
    std::uint16_t y[I];
};
template <>
struct mediumBlock<0> {
};

template <int I>
struct smallBlock {
    std::uint8_t z[(I / 8) + ((I % 8) ? 1 : 0) ];
};
template <>
struct smallBlock<0> {
};

template <int I>
struct sampleStruct : largeBlock<I / 64>, mediumBlock<(I % 64) / 16>, smallBlock<(I % 16)> {

};

Now your operations have to be implemented in terms of calls to the base objects...

edited Mar 31 '16 at 11:46

answered Mar 31 '16 at 10:59

Nim

33,299
2
62
101

It definitely allows to achieve what i want to do, in terms of storage it would be the minimum required. However there would be a computational difference if you want to implement a shift operator as instance. – user8469759 Mar 31 '16 at 11:04
The above approach is also efficient for storage (which is what you were after), with your setup and padding, your structures will be quite large (for example 16 bytes for 86 bits vs 11) - so for storage optimization the above is hard to beat.. If you rephrase your question to consider run-time operation cost, then it's possible too.. – Nim Mar 31 '16 at 11:45
Why my implementation for 86 bits would use 16 bytes? It is 64 + 16 + 8, i.e. 8 + 2 + 1 bytes = 11 bytes. I was thinking of very large structures, If you take a boundary case, let's say 1024 bit i need 16 uint64_t variables, against 128 uint8_t, I i do a 1 bit right shift i would cycle on 16 uint64_t variables against 128 uint8_t, I think every that every optimizaton you do on 8 bit chunk would be the same in 64 bit chunk in this case. Another trivial case... let's say you want to allocate 8 byte, in your case you would use 4 byte to do that, am I right? Would that be better or equivalent? – user8469759 Mar 31 '16 at 12:13
By the way.. I already used your approach for allocation, it is good and easy to handle of course however I think C++ template could allow me to achieve better performance, mine is just a guess. – user8469759 Mar 31 '16 at 12:22
(You haven't considering the padding - which is why it's 16 bytes.) As for the operations, there is nothing stopping you from storing in an array of `uint8_t`, yet operating on it as if it were 64/16/8 bit values (by casting..) – Nim Mar 31 '16 at 12:32
Slightly unrelated, why there's such padding? I thought allocation in structure and classes just take the size of each variable and allocate that block in memory consecutively. – user8469759 Mar 31 '16 at 12:35

Metaprogramming for optimizing storage/run time algorithm, C++

2 Answers2