
I am writing tools to dump and load common objects to and from a binary file. As a first quick implementation, I wrote the following code for std::vector<bool>. It works, but it is clearly not space-efficient, since every element is written out individually as a bool.

template <>
void binary_write(std::ofstream& fout, const std::vector<bool>& x)
{
    std::size_t n = x.size();
    fout.write((const char*)&n, sizeof(std::size_t));
    for(std::size_t i = 0; i < n; ++i)
    {
        bool xati = x.at(i);
        binary_write(fout, xati);
    }
}

template <>
void binary_read(std::ifstream& fin, std::vector<bool>& x)
{
    std::size_t n;
    fin.read((char*)&n, sizeof(std::size_t));
    x.resize(n);
    for(std::size_t i = 0; i < n; ++i)
    {
        bool xati;
        binary_read(fin, xati);
        x.at(i) = xati;
    }
}

How can I copy the internal memory of a std::vector<bool> to my stream?

Note: I don't want to replace std::vector<bool> with something else.

Caduchon
  • Even if you are already using `std::vector` elsewhere in the code, I strongly suggest you move to something like `std::bitset` or `boost::dynamic_bitset` and use their `to_string` functionality, or their `ostream` overloads of `operator<<`. – rubenvb Apr 14 '15 at 09:29
  • `to_string` for binary storage? Really? ^^ – Caduchon Apr 14 '15 at 09:31
  • 1
    Right, not my smartest comment ;). Still, after looking up the functionality of std::bitset, that seems like the only way to go (bitset->string->integer of some kind). That, or fetching the bits one by one. I'm curious which would be faster... Hmm on second thought, just stick with `std::vector` (see e.g. [this question](http://stackoverflow.com/a/13504640/256138)) – rubenvb Apr 14 '15 at 09:48
  • Making data persistent is the job of a serializer. No need to handcraft that. – Klaus Jun 03 '19 at 11:19
  • @Klaus: writing a serializer with specific needs is my job. I don't need judgment on the relevance of the question. I need solutions. ;-) – Caduchon Jun 04 '19 at 08:26
  • In the first [comment](https://stackoverflow.com/a/6485519/8428146) on this answer, a user points out that **vector doesn't have contiguous memory storage of bools**. – Y00 Oct 14 '19 at 23:23

2 Answers


I am answering my own question. This is currently accepted as the best answer, but that can change if someone provides something better.

One way to do it is the following. It has to access each value individually, but it works, and it stores 8 values per byte.

template <>
void binary_write(std::ofstream& fout, const std::vector<bool>& x)
{
    std::vector<bool>::size_type n = x.size();
    fout.write((const char*)&n, sizeof(std::vector<bool>::size_type));
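    for(std::vector<bool>::size_type i = 0; i < n;)
    {
        // Pack up to 8 consecutive values into one byte, least significant bit first.
        unsigned char aggr = 0;
        // mask becomes 0 once it is shifted past the top bit, which ends the inner loop.
        for(unsigned char mask = 1; mask > 0 && i < n; ++i, mask <<= 1)
            if(x.at(i))
                aggr |= mask;
        fout.write((const char*)&aggr, sizeof(unsigned char));
    }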
}

template <>
void binary_read(std::ifstream& fin, std::vector<bool>& x)
{
    std::vector<bool>::size_type n;
    fin.read((char*)&n, sizeof(std::vector<bool>::size_type));
    x.resize(n);
    for(std::vector<bool>::size_type i = 0; i < n;)
    {
        // Read one byte and unpack up to 8 values from it, least significant bit first.
        unsigned char aggr;
        fin.read((char*)&aggr, sizeof(unsigned char));
        for(unsigned char mask = 1; mask > 0 && i < n; ++i, mask <<= 1)
            x.at(i) = aggr & mask;
    }
}
Caduchon
  • Writing the size like that isn't endian-safe. Also, the size of a vector is a `std::vector<bool>::size_type`, which isn't necessarily the same as an unsigned int. – Tom Tanner May 18 '16 at 12:19
  • You're right, but it seems nearly impossible for the size of an existing vector to overflow an `unsigned long int`, because `unsigned long int >= void* >= size of the RAM >= size of the vector`. In this specific case bits are counted, so the number of elements could be greater than the RAM, but the template constraint leads me to think that the same integral type is used for all vectors. In my case `size_type` is a `size_t`, which is a 64-bit unsigned integer, the same as `unsigned long int`. The bug can occur only for hypothetical compilers with a really huge vector of bools. – Caduchon May 18 '16 at 12:52
  • I modified the answer according to your comment. I hate using size types when they affect code outside the scope (for example when correlated with data from the user), because it becomes really unclear for the developer to manage. But that's not the case here. – Caduchon May 18 '16 at 12:58

Sorry, but the answer is that you can't do this portably.

To do this non-portably, you can write a function specific to your standard library implementation's iterators for vector<bool>.

If you're lucky, the relevant fields will be public inside the iterators, so you don't have to change private to public.

user541686
  • Actually, I can do this portably by aggregating 8 values into 1 byte and storing that byte in my file. But I would prefer a nicer solution. :-) – Caduchon Apr 14 '15 at 09:40
  • 2
    @Caduchon: You have to access the vector's bits individually, though. My point was that you can't avoid that. – user541686 Apr 14 '15 at 09:44