22

I am facing a small problem. I have a struct that contains a vector. The size of the vector changes on every iteration. In a given iteration, how do I store the struct, which contains a vector of size n, to a binary file?

Also, when retrieving, assume that I know the size of the vector. How do I read the struct variable back from the binary file, including the vector with all of its stored elements?

I am able to store something to the binary file (I can see the file size increasing when writing), but when I try to read the elements back, the size of the vector comes out as zero.

Unfortunately, I have to achieve this using the standard STL and not use any third-party libraries.

Ben Voigt
Shankar Raju

2 Answers

27

You should have a look at Boost Serialization.

If you can't use 3rd party libraries, you must know that C++ doesn't support serialization directly. This means you'll have to do it yourself.

This article shows a nice way of serializing a custom object to the disk and retrieving it back. And this tutorial shows you how to get started right now with fstream.

This is my attempt:

EDIT: since the OP asked how to store/retrieve more than one record, I decided to update the original code.

So, what changed? Now there's an array student_t apprentice[3]; to store information of 3 students. The entire array is serialized to the disk and then it's all loaded back to the RAM where reading/searching for specific records is possible. Note that this is a very very small file (84 bytes). I do not suggest this approach when searching records on huge files.

#include <fstream>
#include <iostream>
#include <vector>
#include <string.h>

using namespace std;


typedef struct student
{
    char name[10];
    int age;
    vector<int> grades;
}student_t;

int main()
{
    student_t apprentice[3];  
    strcpy(apprentice[0].name, "john");
    apprentice[0].age = 21;
    apprentice[0].grades.push_back(1);
    apprentice[0].grades.push_back(3);
    apprentice[0].grades.push_back(5);    

    strcpy(apprentice[1].name, "jerry");
    apprentice[1].age = 22;
    apprentice[1].grades.push_back(2);
    apprentice[1].grades.push_back(4);
    apprentice[1].grades.push_back(6);

    strcpy(apprentice[2].name, "jimmy");
    apprentice[2].age = 23;
    apprentice[2].grades.push_back(8);
    apprentice[2].grades.push_back(9);
    apprentice[2].grades.push_back(10);

    // Serializing the array to students.data
    ofstream output_file("students.data", ios::binary);
    output_file.write((char*)&apprentice, sizeof(apprentice));
    output_file.close();

    // Reading from it
    ifstream input_file("students.data", ios::binary);
    student_t master[3];
    input_file.read((char*)&master, sizeof(master));         

    for (size_t idx = 0; idx < 3; idx++)
    {
        // If you wanted to search for specific records, 
        // you should do it here! if (idx == 2) ...

        cout << "Record #" << idx << endl;
        cout << "Name: " << master[idx].name << endl;
        cout << "Age: " << master[idx].age << endl;
        cout << "Grades: " << endl;
        for (size_t i = 0; i < master[idx].grades.size(); i++)
           cout << master[idx].grades[i] << " ";
        cout << endl << endl;
    }

    return 0;
}

Outputs:

Record #0
Name: john
Age: 21
Grades: 
1 3 5 

Record #1
Name: jerry
Age: 22
Grades: 
2 4 6 

Record #2
Name: jimmy
Age: 23
Grades: 
8 9 10

Dump of the binary file:

$ hexdump -c students.data 
0000000   j   o   h   n  \0 237   {  \0   �   �   {   � 025  \0  \0  \0
0000010   (   �   �  \b   4   �   �  \b   8   �   �  \b   j   e   r   r
0000020   y  \0   �  \0   �   �   |  \0 026  \0  \0  \0   @   �   �  \b
0000030   L   �   �  \b   P   �   �  \b   j   i   m   m   y  \0  \0  \0
0000040   �   6   �  \0 027  \0  \0  \0   X   �   �  \b   d   �   �  \b
0000050   h   �   �  \b                                                
0000054
karlphillip
  • Also, take a look at: http://stackoverflow.com/questions/523872/how-do-you-serialize-an-object-in-c/523882#523882 – karlphillip Mar 31 '11 at 21:39
  • Unfortunately, I have to achieve this using the standard STL and not use any third-party libraries. – Shankar Raju Mar 31 '11 at 21:40
  • Originally, I only posted about Boost Serialization without realizing the author said he couldn't use 3rd party libraries. – karlphillip Mar 31 '11 at 21:57
  • 7
    You don't usually have any chance of struct dump of a std::vector<> working. [I suppose it _might_ work for (very) small vectors on some implementations!] – Keith Mar 31 '11 at 22:47
  • 1
    Hey thank you so much Karlphillip. I'm sorry, earlier since you gave suggestion about Boost library, I thought if I had edited my question, others would not give 3rd party library suggestions. – Shankar Raju Apr 01 '11 at 02:48
  • I have a question though. Suppose, I have like 25-30 student details in the binary file (from the example shown above), how can I read, say 10th student detail from the file? Please reply, thanks – Shankar Raju Apr 01 '11 at 02:49
  • If I put the reading and writing in to the separate functions, this breaks. it doesn't work. – Muneer Apr 09 '14 at 11:02
  • Maybe your code has a problem since this has been tested and used by several people before you? – karlphillip Apr 09 '14 at 14:33
  • 8
    This code is horribly broken. It appears to work in your tests only because you're reading the data back into the same process that wrote it, so the pointers read from the disk still point to the same objects. The object data was never saved correctly, and when the original objects are destroyed (perhaps by process restart), the array read back from the file will stop working. – Ben Voigt Jun 13 '15 at 20:29
  • @BenVoigt Is it still broken? Would it work if one just saved one student, i.e. if one avoided saving an array? Or is the `vector<int> grades;` the problem? Would it be any different if there was only one grade, i.e. int instead of vector<int>? – langlauf.io Apr 18 '16 at 15:42
  • 2
    @stackoverflowwww: The `vector` is indeed the problem, it contains a capacity, count, and pointer to the real data. Saving that pointer to disk is meaningless, you want to save the data it points to. Jerry's answer explains a good approach for doing that. – Ben Voigt Apr 18 '16 at 15:52
  • @Keith: "small string optimization" is illegal for `std::vector`, because the required iterator-validity constraints on `swap` forbid moving the elements. – Ben Voigt Apr 18 '16 at 15:57
  • 4
    Downvoted, because "serializing" a `std::vector` this way is complete nonsense. – Daniel Jour Sep 18 '16 at 08:46
20

You typically serialize a vector by writing the length of the vector, followed by that number of elements. When you read it back in, having the length come first lets you know how many more items to read as part of that vector. As a simple first approximation, consider something like this:

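#include <algorithm>
#include <iostream>
#include <iterator>
#include <vector>

// Restricted to std::vector so these overloads don't collide with the
// standard operator<< / operator>> for other types.

// Write the element count first, then one element per line.
template<class T>
std::ostream &operator<<(std::ostream &output, std::vector<T> const &input) {
    typename std::vector<T>::size_type size = input.size();

    output << size << "\n";
    std::copy(input.begin(), input.end(), 
         std::ostream_iterator<T>(output, "\n"));

    return output;
}

// Read the count, then exactly that many elements.
template<class T>
std::istream &operator>>(std::istream &input, std::vector<T> &output) {
    typename std::vector<T>::size_type size = 0;

    input >> size;
    output.resize(size);
    // Constructing an istream_iterator may already consume an element,
    // so skip it entirely for an empty vector.
    if (size > 0)
        std::copy_n(
            std::istream_iterator<T>(input),
            size,
            output.begin());

    return input;
}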

This is open to lots of tweaks, improvements, and simple modifications. For example, I've passed the container (a vector here, though the same pattern applies to a std::deque and so on) by reference rather than passing iterators. That probably simplifies most use, but doesn't fit as well with the rest of the library.

This also serializes in text format, one number per line. Discussions comparing text to binary have happened before, so I won't try to repeat all the arguments here -- I'll just note that the same basic idea can be done in binary format just as well as text.

Jerry Coffin
  • Thank you Jerry. I had a similar implementation where I serialized a vector containing in a struct to a binary file, but the size of vector was retrieved as zero when reading, anyways let me revisit my solution and I will get back to you if I need some clarification. Thanks. – Shankar Raju Apr 01 '11 at 02:53
  • Capitalize the T in the second function: `std::istream_iterator<T::value_type>(input),` (tried to edit and fix, but it would not take it.) – jmcarter9t Jul 08 '19 at 22:28