I'm trying to write a structure to a binary file. The structure consists of strings and integers. If there were no strings I'd just take the entire object and write it to the binary file normally, but if I do that now, the string can be read easily as well.

So I decided to write each attribute of the structure separately. I handled the strings the same way as described in this Stack Overflow answer.

This is the main function that is supposed to save the structure into the name.bin file.

void saveFileBin(std::string nameOfFile) {
    Person people[SIZE_OF_ARRAY] =
    {Person("Name1", "lastName1", Address("Street1", "City1", "111"), Date(1, 1, 1111)),
    Person("Name2", "lastName2", Address("Street2", "City2", "222"), Date(2, 2, 2222)),
    Person("Name3", "lastName3", Address("Street3", "City3", "333"), Date(3, 3, 3333))};

    std::ofstream myFile(nameOfFile + ".bin", std::ios::binary);
    if (myFile.is_open())
    {
        for (int i = 0; i < SIZE_OF_ARRAY; i++)
        {
            people[i].write(&myFile);
        }
        myFile.close();

        std::cout << "The entire thing is in memory";
    }
    else 
    {
        throw std::runtime_error("Unable to open file"); // portable; needs <stdexcept>
    }
}

And this is the function that writes each attribute into the file.

void Person::write(std::ofstream* out)
{
    out->write(_name.c_str(), _name.size());
    out->write(_lastName.c_str(), _lastName.size());
    out->write(_residence.getStreet().c_str(), _residence.getStreet().size());
    out->write(_residence.getZip().c_str(), _residence.getZip().size());
    out->write(_residence.getCity().c_str(), _residence.getCity().size());
    std::string day = std::to_string(_birthDate.getDay());
    std::string month = std::to_string(_birthDate.getMonth());
    std::string year = std::to_string(_birthDate.getYear());
    out->write(day.c_str(), day.size());
    out->write(month.c_str(), month.size());
    out->write(year.c_str(), year.size());
}

The resulting file has everything readable as plain text. If I instead call myFile.write((char*)people, sizeof(people)); in the main method, unreadable characters show up as expected, but the string variables can still be read normally. That's why I converted all variables to strings and wrote them to the .bin file one by one.

Why does my method still show all strings as if it weren't a binary file at all, even though I pass std::ios::binary as a parameter?

The output file contains this:

Name1lastName1Street11111City11111111Name2lastName2Street22222City22222222Name3lastName3Street33333City3333333

Whereas if I write the entire structure into the binary file, it looks like this:

     lÊ87  Name1 ÌÌÌÌÌÌÌÌÌÌ              à^Ê87  lastName1 ÌÌÌÌÌÌ                  Ð_Ê87  Street1 ÌÌÌÌÌÌÌÌ              ÐdÊ87  City1 ÌÌÌÌÌÌÌÌÌÌ               bÊ87  111 ÌÌÌÌÌÌÌÌÌÌÌÌ                    W  ÌÌÌÌ`kÊ87  Name2 ÌÌÌÌÌÌÌÌÌÌ              fÊ87  lastName2 ÌÌÌÌÌÌ                 €iÊ87  Street2 ÌÌÌÌÌÌÌÌ              PbÊ87  City2 ÌÌÌÌÌÌÌÌÌÌ              ÐiÊ87  222 ÌÌÌÌÌÌÌÌÌÌÌÌ                    ®  ÌÌÌÌ€dÊ87  Name3 ÌÌÌÌÌÌÌÌÌÌ               `Ê87  lastName3 ÌÌÌÌÌÌ                p`Ê87  Street3 ÌÌÌÌÌÌÌÌ               gÊ87  City3 ÌÌÌÌÌÌÌÌÌÌ              ðbÊ87  333 ÌÌÌÌÌÌÌÌÌÌÌÌ                    
  ÌÌÌÌ lÊ87  Name1 ÌÌÌÌÌÌÌÌÌÌ              à^Ê87  lastName1 ÌÌÌÌÌÌ                Ð_Ê87  Street1 ÌÌÌÌÌÌÌÌ              ÐdÊ87  City1 ÌÌÌÌÌÌÌÌÌÌ               bÊ87  111 ÌÌÌÌÌÌÌÌÌÌÌÌ                    W  ÌÌÌÌ`kÊ87  Name2 ÌÌÌÌÌÌÌÌÌÌ              fÊ87  lastName2 ÌÌÌÌÌÌ                 €iÊ87  Street2 ÌÌÌÌÌÌÌÌ              PbÊ87  City2 ÌÌÌÌÌÌÌÌÌÌ              ÐiÊ87  222 ÌÌÌÌÌÌÌÌÌÌÌÌ                    ®  ÌÌÌÌ€dÊ87  Name3 ÌÌÌÌÌÌÌÌÌÌ               `Ê87  lastName3 ÌÌÌÌÌÌ                p`Ê87  Street3 ÌÌÌÌÌÌÌÌ               gÊ87  City3 ÌÌÌÌÌÌÌÌÌÌ              ðbÊ87  333 ÌÌÌÌÌÌÌÌÌÌÌÌ                    
  ÌÌÌÌ lÊ87  Name1 ÌÌÌÌÌÌÌÌÌÌ              à^Ê87  lastName1 ÌÌÌÌÌÌ                Ð_Ê87  Street1 ÌÌÌÌÌÌÌÌ              ÐdÊ87  City1 ÌÌÌÌÌÌÌÌÌÌ               bÊ87  111 ÌÌÌÌÌÌÌÌÌÌÌÌ                    W  ÌÌÌÌ`kÊ87  Name2 ÌÌÌÌÌÌÌÌÌÌ              fÊ87  lastName2 ÌÌÌÌÌÌ                 €iÊ87  Street2 ÌÌÌÌÌÌÌÌ              PbÊ87  City2 ÌÌÌÌÌÌÌÌÌÌ              ÐiÊ87  222 ÌÌÌÌÌÌÌÌÌÌÌÌ                    ®  ÌÌÌÌ€dÊ87  Name3 ÌÌÌÌÌÌÌÌÌÌ               `Ê87  lastName3 ÌÌÌÌÌÌ                p`Ê87  Street3 ÌÌÌÌÌÌÌÌ               gÊ87  City3 ÌÌÌÌÌÌÌÌÌÌ              ðbÊ87  333 ÌÌÌÌÌÌÌÌÌÌÌÌ                    
  ÌÌÌÌ

EDIT: As requested, here is the header Person.h

#pragma once
#ifndef PERSON_H
#define PERSON_H
#include <string>
#include "Address.h"
#include "Date.h"
#include <fstream>

struct Person {
public:
    Person(std::string name, std::string last_name, Address _residence, Date birthDate);
    Person();
    friend std::ostream& operator<<(std::ostream& os, const Person& p);
    friend std::istream& operator>>(std::istream& is, Person& p);
    std::string getName() const { return _name; }
    std::string getLastName() const { return _lastName; };
    Address getResidence() const { return _residence; };
    Date getDate() const { return _birthDate; };
    void read(std::ifstream *in);
    void write(std::ofstream *out);
private:
    std::string _name;
    std::string _lastName;
    Address _residence;
    Date _birthDate;
};
#endif // !PERSON_H
Antrophy
  • Character `A` is represented internally by its ASCII code 65. Writing a string into a binary file, the bytes of the string will be written. Looking with a text editor into this binary file, the bytes will be interpreted as characters again i.e. the byte with value 65 becomes an `A` again. Whatever you thought binary is, it's just binary but nothing encrypted. Have a look into an `.exe` file. Even there, you will find some portions of readable text (probably). – Scheff's Cat Nov 07 '19 at 15:40
  • Btw. `out->write(_name.c_str(), _name.size());` writes all characters but records nothing about where the string ends. Either write `_name.size() + 1` bytes to capture the 0 terminator as well, or prefix the string by writing its length (in binary). – Scheff's Cat Nov 07 '19 at 15:43
  • A binary file does not mean "encrypted". Btw, could you also provide the declaration of the Person class? I suspect that your write((char*)people, sizeof(people)) does not write any string to the file but only pointers to them – Gojita Nov 07 '19 at 15:45
  • You will have a hard time figuring out how to read the data back in a coherent manner, given that you have not indicated where each string begins and ends in the file. – PaulMcKenzie Nov 07 '19 at 15:45
  • @Scheff Edited my question with the output files both for the individual variables and the whole structure. I feel like the first one which I'm attempting should not look like this, or am I wrong? The second one also has variables written normally. – Antrophy Nov 07 '19 at 15:47
  • @Antrophy *Whereas if I write the entire structure into the binary file* -- Let's see the structure. As the previous comment stated, if `Person` is not trivially copyable, then writing its raw bytes to a binary file shouldn't be done. – PaulMcKenzie Nov 07 '19 at 15:48
  • You should provide the declaration of `Person` as well. How to serialize things correctly mainly depends on the type of each thing. – Scheff's Cat Nov 07 '19 at 15:48
  • @Scheff added the declaration of `Person`. – Antrophy Nov 07 '19 at 15:49
  • I'm afraid you overestimated the meaning of `std::ios::binary`. (Not that you don't need it.) Please, have a look onto this: [Binary and text modes](https://en.cppreference.com/w/cpp/io/c#Binary_and_text_modes) – Scheff's Cat Nov 07 '19 at 15:49
  • @Antrophy -- Well that settles it -- you can't write `Person` to a binary file like that. Your class is not POD or trivially copyable. Also think about it logically -- you probably used `sizeof(Person)` in your writing code. If those strings contain a million characters, `sizeof(Person)` is not going to be several million bytes. – PaulMcKenzie Nov 07 '19 at 15:50
  • @Antrophy [See this example](https://coliru.stacked-crooked.com/a/c69b26eac352ce50). The `sizeof(Person)` is 72, regardless that `name` contains a one million character string of stars. So using `sizeof(Person)` to specify the byte count when writing wasn't going to work. – PaulMcKenzie Nov 07 '19 at 16:02
  • @PaulMcKenzie I see, so basically the only option to write this struct into a binary file is to use the custom method shown in the question. And the fact that it is readable by me doesn't mean anything, and it is indeed a binary format, correct? – Antrophy Nov 07 '19 at 16:07
  • It would be better to use a library that handles that work in a consistent manner, like boost or google protobuf. But you have the basic idea correct -- you would need to serialize each member in some way where you can recreate the `Person` object from the file when you read the file. – PaulMcKenzie Nov 07 '19 at 16:17

1 Answer

A sample to serialize a std::string (with a limited length of < 65536 characters):

#include <cassert>
#include <iostream>
#include <fstream>
#include <string>

void writeString(std::ostream &out, const std::string &str)
{
  // write length of string (two bytes, little endian)
  assert(str.size() < 1 << 16);
  const size_t size = str.size();
  char buffer[2] = { (char)(size & 0xff), (char)(size >> 8 & 0xff) };
  out.write(buffer, sizeof buffer)
  // write string contents
  && out.write(str.c_str(), size);
}

void readString(std::istream &in, std::string &str)
{
  // read length
  char buffer[2];
  if (!in.read(buffer, 2)) return; // failed
  const size_t size = (unsigned char)buffer[0] | (unsigned char)buffer[1] << 8;
  // allocate size
  str.resize(size);
  // read contents
  in.read(&str[0], size);
}

int main()
{
  // sample
  std::string name = "Antrophy";
  // write binary file
  { std::ofstream out("test.dat", std::ios::binary);
    writeString(out, name);
  } // closes file
  // reset sample
  name = "";
  // read binary file
  { std::ifstream in("test.dat", std::ios::binary);
    readString(in, name);
  } // closes file
  // report result
  std::cout << "name: '" << name << "'\n";
}

Output:

name: 'Antrophy'

The hexdump of test.dat:

00000000  08 00 41 6e 74 72 6f 70  68 79                    |..Antrophy|
0000000a

Live Demo on coliru

Note:

Consider how the length (limited to 16 bits) is written: byte by byte, in a fixed order. Integral values can be serialized in the same way.


A (IMHO) good introduction is provided by the C++FAQ:

Serialization and Unserialization


An extended sample for binary I/O of a composed type Person:

#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iostream>
#include <fstream>
#include <string>

template <size_t nBytes, typename VALUE>
std::ostream& writeInt(std::ostream &out, VALUE value)
{
  const size_t size = sizeof value;
  char buffer[nBytes];
  const size_t n = std::min(nBytes, size);
  for (size_t i = 0; i < n; ++i) {
    buffer[i] = (char)(value >> 8 * i & 0xff);
  }
  for (size_t i = size; i < nBytes; ++i) buffer[i] = '\0';
  return out.write(buffer, nBytes);
}

template <size_t nBytes, typename VALUE>
std::istream& readInt(std::istream &in, VALUE &value)
{
  const size_t size = sizeof value;
  char buffer[nBytes];
  if (in.read(buffer, nBytes)) {
    value = (VALUE)0;
    const size_t n = std::min(nBytes, size);
    for (size_t i = 0; i < n; ++i) {
      value |= (VALUE)(unsigned char)buffer[i] << 8 * i;
    }
  }
  return in;
}

void writeString(std::ostream &out, const std::string &str)
{
  // write length of string (two bytes, little endian)
  assert(str.size() < 1 << 16);
  const size_t size = str.size();
  writeInt<2>(out, size)
  // write string contents
  && out.write(str.c_str(), size);
}

void readString(std::istream &in, std::string &str)
{
  // read length
  std::uint16_t size = 0;
  if (!readInt<2>(in, size)) return; // failed
  // allocate size
  str.resize(size);
  // read contents
  in.read(&str[0], size);
}

struct Person {
  std::string lastName, firstName;
  int age;
  
  void write(std::ostream&) const;
  void read(std::istream&);
};

void Person::write(std::ostream &out) const
{
  writeString(out, lastName);
  writeString(out, firstName);
  writeInt<2>(out, age);
}

void Person::read(std::istream &in)
{
  readString(in, lastName);
  readString(in, firstName);
  std::int16_t age; assert(sizeof age == 2); // ensure proper sign extension
  if (readInt<2>(in, age)) this->age = age;
}

int main()
{
  // sample
  Person people[2] = {
    { "Mustermann", "Klaus", 23 },
    { "Doe", "John", -111 }
  };
  // write binary file
  { std::ofstream out("test.dat", std::ios::binary);
    for (const Person &person : people) person.write(out);
  } // closes file
  // read sample
  Person peopleIn[2] = {
    { "", "", -1 },
    { "", "", -1 }
  };
  // read binary file
  { std::ifstream in("test.dat", std::ios::binary);
    for (Person &person : peopleIn) person.read(in);
  } // closes file
  // report result
  int i = 1;
  for (const Person &person : peopleIn) {
    std::cout << "person " << i++ << ": '"
      << person.firstName << ' ' << person.lastName
      << ", age: " << person.age << '\n';
  }
}

Output:

person 1: 'Klaus Mustermann, age: 23
person 2: 'John Doe, age: -111

The hexdump of test.dat:

00000000  0a 00 4d 75 73 74 65 72  6d 61 6e 6e 05 00 4b 6c  |..Mustermann..Kl|
00000010  61 75 73 17 00 03 00 44  6f 65 04 00 4a 6f 68 6e  |aus....Doe..John|
00000020  91 ff                                             |..|
00000022

Live Demo on coliru

Note:

The binary I/O for integral values (readInt() and writeInt()) might look overcomplicated compared to the simple out.write((char*)&value, sizeof value); found elsewhere. I did it this way because it is more portable: the file layout stays the same across platforms with different endianness and/or different sizes of integral types.

Scheff's Cat