
I created a Python script which creates the following map (illustration):

map<uint32_t, string> tempMap = {{2,"xx"}, {200, "yy"}};

and saved it as a binary file, map.out. When I try to read that file from C++, the map is not populated. Why?

    map<uint32_t, string> tempMap;
    ifstream readFile;
    std::streamsize length;

    readFile.open("somePath\\map.out", ios::binary | ios::in);
    if (readFile)
    {
        readFile.ignore( std::numeric_limits<std::streamsize>::max() );
        length = readFile.gcount();
        readFile.clear();   //  Since ignore will have set eof.
        readFile.seekg( 0, std::ios_base::beg );
        readFile.read((char *)&tempMap,length);
        for(auto &it: tempMap)
        {
          /* cout<<("%u, %s",it.first, it.second.c_str()); ->Prints map*/
        }
    }
    readFile.close();
    readFile.clear();
1201ProgramAlarm
Sababoni
  • Serialization (use this keyword for research) is just not as simple as that. BTW: C-style casts are almost always a cause of problems, as in your case. https://stackoverflow.com/questions/69920134/the-output-of-the-code-in-the-text-file-is-weird-characters-and-it-should-be-nam#comment123597841_69920134 is a recent question by someone who is effectively facing the same issue. – Ulrich Eckhardt Nov 11 '21 at 17:42
  • Did you look to https://msgpack.org/index.html ? – mpromonet Nov 27 '21 at 10:53

2 Answers


It's not possible to read raw bytes from a file and have them construct a map (or most other containers, for that matter)[1], so you will have to write code that performs proper serialization instead.

If the data being stored and loaded is simple, as in your example, you can easily devise a scheme for serializing it and then write the code to load it. For example, a simple plaintext format can be established by writing each key and each value on its own line:

<number>
<string>
...

So for your example of:

std::map<std::uint32_t, std::string> tempMap = {{2,"xx"}, {200, "yy"}};

this could be encoded as:

2
xx
200
yy

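In that case, the code to deserialize this simply reads the values one by one and reconstructs the map:

// Note: untested
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <limits>
#include <map>
#include <string>
#include <utility>

auto loadMap(const std::filesystem::path& path) -> std::map<std::uint32_t, std::string>
{
  auto result = std::map<std::uint32_t, std::string>{};
  auto file = std::ifstream{path};

  while (true) {
    auto key = std::uint32_t{};
    auto value = std::string{};

    // Read the key, then discard the rest of its line so that getline
    // starts at the value line instead of returning an empty string.
    if (!(file >> key)) { break; }
    file.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    if (!std::getline(file, value)) { break; }

    result[key] = std::move(value);
  }
  return result;
}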

Note: For this to work, your Python program needs to write exactly the format that your C++ program reads.

If the data you are trying to read and write is more complicated, you may want to look into existing serialization interchange formats. Since you're working between Python and C++, you'll need libraries that support both. For a list of recommendations, see the answers to "Cross-platform and language (de)serialization".


[1] The reason you can't just read (or write) the whole container as bytes and have it work is that the data in containers isn't stored inline. Writing the raw bytes out won't automatically produce something like 2 xx\n200 yy\n. Instead, you'll be writing raw pointer values that refer to indirect data structures, such as the map's internal node objects.
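One way to convince yourself that the elements are not stored inside the map object is to look at its size, which is fixed at compile time. The sketch below uses a hypothetical helper, mapObjectSize, purely for illustration:

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

using Map = std::map<std::uint32_t, std::string>;

// Number of bytes that a raw read()/write() of the map object itself
// would cover. sizeof is a compile-time constant, so the result cannot
// depend on how many elements the map currently holds.
std::size_t mapObjectSize(const Map&)
{
  return sizeof(Map);
}
```

An empty map and a map with thousands of entries report the same size, so the entries must live elsewhere, behind pointers.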

For example, a hypothetical map implementation might contain a node like:

template <typename Key, typename Value>
struct map_node
{
  Key key;
  Value value;
  map_node* left;
  map_node* right;
};

(The real map implementation is much more complicated than this; this is only a simplified representation.)

If map<Key,Value> contains a map_node<Key,Value> member, then writing it out in binary will write the binary representation of key, value, left, and right, the latter two of which are pointers. The same is true of any container that uses indirection of any kind: the addresses will differ between the time they are written and the time they are read, since they depend on the state of the program at that moment.

You can write a simple map_node to test this and print out its bytes to see what it produces; the pointers will be serialized as well. Behaviorally, this is exactly what you are trying to do with a map and a binary file. See the example below, which shows different addresses.

Live Example
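For readers who cannot open the link, here is a small sketch of the same idea. It is not the linked example; list_node, rawBytes, and storedPointer are hypothetical names introduced only for illustration, and the node is reduced to one key and one pointer.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Simplified, hypothetical node with a single pointer member.
struct list_node
{
  std::uint32_t key;
  list_node* next;
};

// Copies the object representation of a node. These are the bytes that
// a raw write() of the struct would put into a file.
std::vector<unsigned char> rawBytes(const list_node& n)
{
  auto bytes = std::vector<unsigned char>(sizeof(list_node));
  std::memcpy(bytes.data(), &n, sizeof(list_node));
  return bytes;
}

// Extracts the pointer value embedded in previously dumped bytes.
list_node* storedPointer(const std::vector<unsigned char>& bytes)
{
  list_node* p = nullptr;
  std::memcpy(&p, bytes.data() + offsetof(list_node, next), sizeof(p));
  return p;
}
```

The dumped bytes contain the address held in next, not the node it points to. That address is only meaningful inside the process that wrote it, so reading the bytes back in another program (or another run) yields a dangling pointer.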

Human-Compiler

You can use Protocol Buffers to serialize your map in Python and deserialize it in C++.

Protocol Buffers supports both Python and C++.

Arun KG