
I am wondering if it is possible to convert a vector of pairs into a byte array.

Here's a small example of creating the vector of pairs:

int main(int argc, char *argv[])
{
    PBYTE FileData, FileData2, FileData3;
    DWORD FileSize, FileSize2, FileSize3;

    /* Here I read 3 files + their sizes and fill the above variables. */

    //Here I create the vector of std::pairs.
    std::vector<std::pair<PBYTE, DWORD>> DataVector
    {
        { FileData, FileSize }, //Pair contains always file data + file size.
        { FileData2, FileSize2 },
        { FileData3, FileSize3 }
    };

    std::cin.ignore(2);
    return 0;
}

Is it possible to convert this vector into a byte array (for compressing, and writing to disk, etc)?

Here is what I tried, but I didn't get even the size correctly:

PVOID DataVectorArr = NULL;

DWORD DataVectorArrSize = DataVector.size() * sizeof DataVector[0];

if ((DataVectorArr = malloc(DataVectorArrSize)) != NULL)
{
    memcpy(DataVectorArr, &DataVector[0], DataVectorArrSize);
}

std::cout << DataVectorArrSize;

//... Here I tried to write DataVectorArr to disk, which fails because the size isn't correct. I'm also not sure whether DataVectorArr actually contains the vector's data.

if (DataVectorArr != NULL) delete DataVectorArr;

Enough code. Is it even possible, or am I doing it wrong? If I am doing it wrong, what would be the solution?

Regards, Okkaaj

Edit: If it's unclear what I am trying to do, read the following (which I commented earlier):

Yes, I am trying to cast the vector of pairs to a PCHAR or PBYTE - so I can store it to disk using WriteFile. After it is stored, I can read it from disk as a byte array and parse it back into a vector of pairs. Is this possible? I got the idea from converting / casting a struct to a byte array and back (read more from here: Converting struct to byte and back to struct), but I am not sure if this is possible with std::vector instead of structures.

  • Shall this byte array be a contiguous region of all pairs described in your pair-vector, in order? – WhozCraig Sep 29 '14 at 18:35
  • 1
    `sizeof DataVector[0]` will give you the same as `sizeof(std::pair)` which probably isn't what you want here. – πάντα ῥεῖ Sep 29 '14 at 18:36
  • What do you mean by 'size is not correct'? What is the expected and actual output? – kraskevich Sep 29 '14 at 18:37
  • @user2040251 I am getting 24 bytes as the size, which cannot be true when I have read bytes of 3 files into the vector. I am expecting a size of 3 files + size of the DWORDS (FileSize1, etc) and the size of the vector. Sorry if this is a bad explanation, I am very tired... – Okkaaj Sep 29 '14 at 18:53
  • it's a bit unclear what you're really trying to achieve. do you want the contents of the three files concatenated and stored somewhere? – Cheers and hth. - Alf Sep 29 '14 at 19:12
  • @Cheersandhth.-Alf Exactly. Yes, I am trying to cast the vector of pairs to a `PCHAR` or `PBYTE` - so I can store it to disk using WriteFile. After it is stored, I can read it from disk as byte array, and parse back to vector of pairs. Is this possible? I got the idea from converting / casting `struct` to a byte array and back(read more from here: http://stackoverflow.com/questions/13775893/converting-struct-to-byte-and-back-to-struct) but I am not sure if this is possible with std::vector instead of structures. – Okkaaj Sep 29 '14 at 19:18
  • If you want code to convert some data structure into an array of characters or bytes, why don't you write some code to do that? Decide on a format, document it, and then write code that actually does it. – David Schwartz Sep 29 '14 at 22:09

1 Answer


Get rid of the malloc and make use of RAII for this:

std::vector<BYTE> bytes;
for (auto const& x : DataVector)
    bytes.insert(bytes.end(), x.first, x.first+x.second);

// bytes now contains all images concatenated end-to-end.
std::cout << bytes.size() << '\n';

To avoid potential resize slow-lanes, you can enumerate the size calculation first, then .reserve() the space ahead of time:

std::size_t total_len = 0;
for (auto const& x : DataVector)
    total_len += x.second;

std::vector<BYTE> bytes;
bytes.reserve(total_len);
for (auto const& x : DataVector)
    bytes.insert(bytes.end(), x.first, x.first+x.second);

// bytes now contains all images concatenated end-to-end.
std::cout << bytes.size() << '\n';

But if all you want to do is dump these contiguously to disk, then why not simply:

std::ofstream outp("outfile.bin", std::ios::out|std::ios::binary);
for (auto const& x : DataVector)
    outp.write(reinterpret_cast<const char*>(x.first), x.second);
outp.close();

skipping the middle man entirely.

And honestly, unless there is a good reason to do otherwise, it is highly likely your DataVector would be better off simply as a std::vector< std::vector<BYTE> > in the first place.


Update

If recovery is needed, you can't just dump the bytes as above. The missing piece is a description of the data itself; in this case, the length of each segment. That length must therefore be stored along with the data. Doing so is trivial, unless you also need the result to be platform-independent.

If that last sentence made you raise your brow, consider the problems with doing something as simple as this:

std::ofstream outp("outfile.bin", std::ios::out|std::ios::binary);
for (auto const& x : DataVector)
{
    uint64_t len = static_cast<uint64_t>(x.second);
    outp.write(reinterpret_cast<const char *>(&len), sizeof(len));
    outp.write(reinterpret_cast<const char*>(x.first), x.second);
}
outp.close();

Well, now you can read each file by doing this:

  • Read a uint64_t to obtain the byte length of the data to follow
  • Read the data of that length

But this has inherent problems. It isn't portable at all. The endian representation of the reader's platform had better match that of the writer, or this fails outright. To overcome that limitation, the length preamble must be written in a platform-independent manner, which is tedious, and is a foundational reason why serialization libraries and their protocols exist in the first place.

If you haven't second-guessed what you're doing and how you're doing it by this point, you may want to read this again.

WhozCraig
  • 65,258
  • 11
  • 75
  • 141
  • 1
    @WhozCraig: it's not at all clear that the OP wants the concatenation of the contents of the three files. at least to me. as I see it sounds like some XY problem: real problem X, asking for implementation problem with imagined solution Y, where Y is not necessarily meaningful as a solution to X. – Cheers and hth. - Alf Sep 29 '14 at 19:10
  • What's the point of writing data this way if you can't read it back properly? – piotrekg2 Sep 29 '14 at 20:11
  • @Cheersandhth.-Alf I concur. Why anyone would do this is beyond me, but it is none-the-less doable (obviously). I would surmise a serialization library (or at least a sound protocol and hand-rolled code on both sides to enforce it) would be what is *really* needed, but with the OP's stated desire its hard saying. If compression were the end-game, why not use a zip-format-wrapped zlib implementation, for example (which is readily available to the masses). I'm thinking you're likely right, that the *real* problem will still surface once this nuance is deployed. – WhozCraig Sep 29 '14 at 21:36