Let's say I have a struct using bit-fields like this:

struct SomeData
{
  char someChar:5;
  char someSmallerChar:3;
};

Thanks to the bit-fields, one SomeData should occupy a single char instead of two.
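The size claim above can be checked with a small sketch. It assumes `unsigned char` fields (the signedness of a plain `char` bit-field is implementation-defined), and `Unpacked`, `printSizes` and `roundTrips` are hypothetical names introduced only for this illustration. Bit-field layout is implementation-defined, so the size of 1 is what common compilers typically give, not a guarantee.

```cpp
#include <cstddef>
#include <cstdio>

// Same layout idea as the struct in the question, but with unsigned fields
// so the value ranges are unambiguous: 0..31 and 0..7.
struct SomeData
{
  unsigned char someChar : 5;
  unsigned char someSmallerChar : 3;
};

// Hypothetical comparison type: the same two members without bit-fields.
struct Unpacked
{
  unsigned char someChar;
  unsigned char someSmallerChar;
};

// Prints both sizes so the saving (if any) can be observed on a given compiler.
void printSizes()
{
  std::printf("sizeof(SomeData) = %zu\n", sizeof(SomeData));
  std::printf("sizeof(Unpacked) = %zu\n", sizeof(Unpacked));
}

// Returns true if the largest values of both fields survive a store and load.
bool roundTrips()
{
  SomeData d{};
  d.someChar = 31;
  d.someSmallerChar = 7;
  return d.someChar == 31 && d.someSmallerChar == 7;
}
```

Calling `printSizes()` on the target compiler shows whether the two fields really share one byte there.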

Now if I want to organize my data this way...

class SomeDataContainer
{
  std::vector<char> someChar;
  std::vector<char> someSmallerChar;
};

... I lose the benefit I had with the bit-fields regarding space efficiency. The size of the equivalent container is now twice the original one.

Is there a way to create a vector of char:5 and char:3, or something similar, to get the same benefits while keeping the data in this vector format (contiguous in memory)?

Norgannon
  • `std::vector<SomeData>` doesn't fit your needs because of contiguity, right? – Jeffrey Jul 15 '19 at 13:06
  • Yes, I need the `someChar` values to be contiguous, and `std::vector<SomeData>` would put the `someSmallerChar` values in between them. – Norgannon Jul 15 '19 at 13:07
  • Of course there is a way: you just have to implement your own container that implements this functionality. There's nothing like this in the standard C++ library, but you can always write your own, and if it complies sufficiently with the C++ library's container requirements, it should be usable with the rest of the library (algorithms, etc.). – Sam Varshavchik Jul 15 '19 at 13:08
  • There is `std::vector<bool>`, which has bit-level granularity (and `std::bitset` with a compile-time length). The next higher granularity provided by the standard library is `std::vector<char>` (or similar). You could write your own container that allows granularity at the level you need. – Max Langhof Jul 15 '19 at 13:09
  • @SamVarshavchik The "complies sufficiently" might be difficult, given what the `std::vector<bool>` disaster has taught us. – Max Langhof Jul 15 '19 at 13:10
  • @max The `std::vector<bool>` disaster is solely because it is a specialization, and not a completely different container. – Sam Varshavchik Jul 15 '19 at 13:12
  • I'm afraid I won't be able to write an efficient enough container myself. I was hoping that Boost or another library had already implemented an efficient container with this granularity. Based on what you say, I guess nothing is available right now. – Norgannon Jul 15 '19 at 13:18
  • Have a look at these answers. https://stackoverflow.com/questions/2633400/c-c-efficient-bit-array – stark Jul 15 '19 at 13:24
  • Actually there is. Use a single `char` variable and set the bits in it with bitwise operations like `>> << &`. Then just store those variables in your vector. – Cactus'as Jul 15 '19 at 13:27
  • Why do you need the `someChar` values to be contiguous? A `std::vector<SomeData>` would allow you to iterate over all of them as if they were right next to each other. – NathanOliver Jul 15 '19 at 13:28
  • @MaxLanghof How is `std::vector<SomeData>` not space efficient? It packs both chars in each element of the vector. – NathanOliver Jul 15 '19 at 13:32
  • @NathanOliver If you want to use only the `someChar` element for some computation and have to iterate through the vector, you lose efficiency when iterating over the vector if there is a `someSmallerChar` in between. (You'll have more cache misses and lose time. Yes, strictly in terms of total storage it is equivalent. It is not equivalent, however, if you only want to access one of the elements, as you will load useless data into the cache.) – Norgannon Jul 15 '19 at 13:34
  • @Norgannon Have you actually measured that? AFAIK you pay a small penalty for reading the bitset, but that is the cost of using bitsets. The data is still "contiguous", so it's not like you are going to get cache misses when iterating. – NathanOliver Jul 15 '19 at 13:36
  • @NathanOliver I am not sure. I am only assuming parts of it. – Norgannon Jul 15 '19 at 13:38
  • @Norgannon If I were you I would start with a `std::vector<SomeData>` in `SomeDataContainer`. Then profile and see what your performance is. It should be the fastest space-efficient method. – NathanOliver Jul 15 '19 at 13:41
  • @NathanOliver Maybe my vocabulary is wrong, but if the vector is too big to fit in the CPU cache, wouldn't you get a cache miss? Reducing the size makes it happen less often, doesn't it? – Norgannon Jul 15 '19 at 13:43
  • @Norgannon No, that's not how it works. The vector and the data the vector contains are in two different places. Once you start reading data from the vector, it starts loading the cache, since it assumes you'll keep reading from the vector. You might get a cache miss on the first read, but all subsequent reads should be pre-fetched into the cache for you. This is why vector is the de facto container. – NathanOliver Jul 15 '19 at 13:46
  • @NathanOliver I see. Well, thanks a lot. I'll do some testing/profiling, including with `std::bitset` as referred to in the question linked by @stark. – Norgannon Jul 15 '19 at 13:49
  • Have you looked into using #pragma pack to keep the continuity? – Sam P Jul 16 '19 at 13:16
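
The custom container suggested in the comments could look roughly like the sketch below. It is only an illustration under stated assumptions, not an existing library type: `PackedVector`, its member functions, and `SomeDataContainer`'s layout here are hypothetical names, values are treated as unsigned, and only widths of 1 to 8 bits are handled. Each element is stored back to back in a contiguous byte buffer, using the shift and mask operations mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container: unsigned values of `Bits` bits each, packed back to
// back in one contiguous byte buffer (little-endian bit order within bytes).
template <unsigned Bits>
class PackedVector
{
  static_assert(Bits >= 1 && Bits <= 8, "this sketch only handles 1..8 bits");
  static constexpr std::uint8_t kMask =
      static_cast<std::uint8_t>((1u << Bits) - 1u);

public:
  std::size_t size() const { return size_; }            // number of elements
  std::size_t byte_size() const { return bytes_.size(); } // bytes of storage

  void push_back(std::uint8_t value)
  {
    // Grow the buffer so it covers the last bit of the new element.
    const std::size_t neededBytes = (size_ * Bits + Bits + 7) / 8;
    if (bytes_.size() < neededBytes) bytes_.resize(neededBytes, 0);
    ++size_;
    set(size_ - 1, value);
  }

  std::uint8_t get(std::size_t index) const
  {
    assert(index < size_);
    const std::size_t byte = (index * Bits) / 8;
    const unsigned shift = static_cast<unsigned>((index * Bits) % 8);
    // Read a 16-bit window, since an element may straddle two bytes.
    unsigned window = bytes_[byte];
    if (byte + 1 < bytes_.size())
      window |= static_cast<unsigned>(bytes_[byte + 1]) << 8;
    return static_cast<std::uint8_t>((window >> shift) & kMask);
  }

  void set(std::size_t index, std::uint8_t value)
  {
    assert(index < size_ && value <= kMask);
    const std::size_t byte = (index * Bits) / 8;
    const unsigned shift = static_cast<unsigned>((index * Bits) % 8);
    const unsigned mask = static_cast<unsigned>(kMask) << shift;
    const unsigned bits = static_cast<unsigned>(value) << shift;
    // Clear the element's bits in the first byte, then write the low part.
    bytes_[byte] = static_cast<std::uint8_t>((bytes_[byte] & ~mask) | (bits & 0xFFu));
    if (shift + Bits > 8)  // the element spills into the next byte
      bytes_[byte + 1] = static_cast<std::uint8_t>(
          (bytes_[byte + 1] & ~(mask >> 8)) | (bits >> 8));
  }

private:
  std::vector<std::uint8_t> bytes_;
  std::size_t size_ = 0;
};

// The container from the question, with each member packed at its own width.
struct SomeDataContainer
{
  PackedVector<5> someChar;
  PackedVector<3> someSmallerChar;
};
```

With this layout, N elements of `someChar` take about 5N/8 bytes and stay contiguous, at the cost of shifts and masks on every access and no iterator or reference support. Whether that beats a plain vector of the bit-field struct is something only profiling can answer.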

0 Answers