1

I'm writing a logger in C++, and I've come to the part where I'd like to take a log record and write it to a file.

I have created a LogRecord struct, and would like to serialize it and write it to a file in binary mode.

I have read some posts about serialization in C++, and one of the answers included the following snippet:

reinterpret_cast<char*>(&logRec)

I've tried reading about reinterpret_cast and what it does, but I couldn't fully understand what's really happening in the background.

From what I understand, it takes a pointer to my struct, and turns it into a pointer to a char, so it thinks that the chunk of memory that holds my struct is actually a string, is that true? How can that work?

melpomene
Asaf
  • "Unlike static_cast, but like const_cast, the reinterpret_cast expression does not compile to any CPU instructions. It is purely a compiler directive which instructs the compiler to treat the sequence of bits (object representation) of expression as if it had the type new_type." source: http://en.cppreference.com/w/cpp/language/reinterpret_cast What happens next is up to your code - treat `reinterpret_cast` as a big caution flag when reviewing code. – Richard Critten Jun 10 '16 at 15:03
  • Not "is actually a string" but "is actually a pointer to `char`". Strings are just one of the uses of `char*`. – molbdnilo Jun 10 '16 at 15:05
  • Possible duplicate of reinterpret_cast (http://stackoverflow.com/questions/4748232/reinterpret-cast) – Leon Jun 10 '16 at 15:10
  • For maintainability, it is best to hide such low-level implementation details. You also have to be very careful about alignment and the possibility that, in the future, you might want to add fields to that structure. In some cases, text-based serialization might be preferable... – Phil1970 Jun 10 '16 at 15:22
  • Writing whole struct records in binary is not necessarily portable even between different versions of the same compiler or even different compilations with the same version of the compiler using different compile flags. For log-files that have some duration I would recommend writing each field individually. Also, if you want platform compatibility, you may want to output your numeric fields in network byte order. – Galik Jun 10 '16 at 15:25
  • By the way, this is a really terrible way to serialize your structure in most cases. – SergeyA Jun 10 '16 at 15:46
  • @Galik, thanks! what do you mean by writing each field individually? – Asaf Jun 10 '16 at 16:10
  • @SergeyA, how would you suggest to do that? – Asaf Jun 10 '16 at 16:10
  • @Asaf I meant to write each struct member variable individually. – Galik Jun 10 '16 at 16:33
  • @Galik, hmm interesting, but how does that solve the portability issue? thanks! – Asaf Jun 10 '16 at 18:20
  • @Galik, found my answer here :) thanks so much! http://stackoverflow.com/questions/17892448/reading-writing-files-to-from-a-struct-class – Asaf Jun 10 '16 at 18:34

4 Answers

9

A memory address is just a memory address. Memory isn't inherently special - it's just a huge array of bytes, for all we care. What gives memory its meaning is what we do with it, and the lenses through which we view it.

A pointer to a struct is just a number that identifies a location in memory. You are free to view that same location differently; in your case, as a pointer to a run of bytes (chars).

reinterpret_cast() doesn't do anything special except allow you to convert one view of a memory address into another view of a memory address. It's still up to you to treat that memory address correctly.

For instance, char* is the conventional way to refer to a string of characters in C++ - but the type char* literally means "a pointer to a single char". How does it come to mean a pointer to a null-terminated string of characters? By convention, that's how. We treat the type differently depending on the context, but it's up to us to make sure we do so correctly.

For instance, how do you know how many bytes to read through your char* pointer to your struct? The type itself gives you zero information - it's up to you to know that you've really got a byte-oriented pointer to a struct of fixed length.
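As a minimal sketch of that point: the `char*` view carries no length, so the caller has to supply `sizeof` itself. The `LogRecord` layout, the helper names `writeRecord` and `readRecord`, and the file path are all assumptions made for illustration; the question does not show the real struct.

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <type_traits>

// Hypothetical record layout; the real LogRecord is not shown in the question.
struct LogRecord {
    std::uint32_t level;
    std::uint64_t timestamp;
    char message[64];
};

// Copying raw bytes is only meaningful for trivially copyable types.
static_assert(std::is_trivially_copyable<LogRecord>::value,
              "LogRecord must be trivially copyable");

// Writes the object's bytes to path. The char* has no length of its own,
// so sizeof supplies the byte count.
bool writeRecord(const std::string& path, const LogRecord& rec) {
    std::ofstream out(path, std::ios::binary);
    out.write(reinterpret_cast<const char*>(&rec), sizeof rec);
    out.flush();  // push the bytes out before checking the stream state
    return static_cast<bool>(out);
}

// Reads sizeof(LogRecord) bytes from path back into rec.
bool readRecord(const std::string& path, LogRecord& rec) {
    std::ifstream in(path, std::ios::binary);
    in.read(reinterpret_cast<char*>(&rec), sizeof rec);
    return static_cast<bool>(in);
}
```

A file written this way can only be relied on when it is read back by a program built with the same compiler, flags, and platform, since the byte layout of the struct is not fixed by the language.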

Remember, under the hood, the machine has no types. A piece of paper doesn't care if you write an essay on each line, or if you scribble all over the thing. It's how we treat it - and how the tools we use (C++) treat it.

antiduh
  • Wow, @antiduh, thank you! Your answer really helped me understand what I'm doing by coding. It seems so logical and probably a given for most, but I didn't think of it that way until now! – Asaf Jun 10 '16 at 15:16
3

Binary-wise, it does nothing at all. This cast is a higher-level concept that has no bearing on any actual machine instructions.

At a low level, a pointer is just a numeric value that holds a memory address. Nothing needs to happen at run time when you tell the compiler "although you thought the destination memory contained a struct, now please think that it contains a char". The actual address itself doesn't change in any way.
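One way to see this for yourself is to compare the two pointers after the cast. The sketch below uses a hypothetical `Sample` struct and a hypothetical helper `sameAddress`; neither comes from the question.

```cpp
// Hypothetical struct used only for this demonstration.
struct Sample {
    int id;
    double value;
};

// Returns true if the char* produced by reinterpret_cast holds the same
// address as the original struct pointer.
bool sameAddress(const Sample& s) {
    const Sample* asStruct = &s;
    const char* asBytes = reinterpret_cast<const char*>(asStruct);
    // Converting both to const void* lets us compare the raw addresses.
    return static_cast<const void*>(asStruct) ==
           static_cast<const void*>(asBytes);
}
```

Only the static type of the expression differs between the two pointers; the value being compared is the same address.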

Smeeheey
1

From what I understand, it takes a pointer to my struct, and turns it into a pointer to a char, so it thinks that the chunk of memory that holds my struct is actually a string, is that true?

Yes.

How can that work?

A string is just a sequence of bytes, and your object is just a sequence of bytes, so that's how it works.

But it won't if your object is logically more than just a sequence of bytes. Any indirection, and you're hosed. Furthermore, any implementation-defined padding or representation/endianness and your data is non-portable. This might be acceptable; it really depends on your requirements.
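To make the padding remark concrete, here is a small sketch that measures how many bytes the compiler added to a struct. The `Mixed` struct and the constant names are made up for illustration, and the actual amount of padding is implementation-defined.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical struct, chosen only because its members make padding likely.
struct Mixed {
    char tag;
    std::int32_t value;
};

// Bytes the members themselves need, ignoring any padding.
constexpr std::size_t kMemberBytes = sizeof(char) + sizeof(std::int32_t);

// Bytes the compiler inserted. Often 3 on common platforms, but the
// language does not guarantee any particular number.
constexpr std::size_t kPaddingBytes = sizeof(Mixed) - kMemberBytes;

static_assert(sizeof(Mixed) >= kMemberBytes,
              "a struct is never smaller than the sum of its members");
```

Any padding bytes end up in the file when the whole struct is written at once, which is one reason the resulting format can differ between compilers or compile flags.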

Lightness Races in Orbit
0

Casting a struct into an array of bytes (chars) is a classic low impact method of binary serialization. This is based on the assumption that the content of the struct exists contiguously in memory. The cast allows us to write this data to a file or socket using the normal APIs.

This only works, though, if the data is contiguous. This is true for C-style structs (PODs in C++ terminology). It will not work with complex C++ objects or with any struct that holds pointers to storage outside the struct. For text data you will need to use fixed-size character arrays.

struct FixedRecord {
    int num;
    char name[50];  // the characters are stored inside the struct
};

will serialize correctly.

struct PointerRecord {
    int num;
    char* name;  // only the pointer is stored; the characters live elsewhere
};

will not serialize correctly, since the data for the string is stored outside the struct.
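When the text cannot live in a fixed-size array, one common alternative is to write each member individually and prefix the string with its length. The following is a sketch under assumptions, not the method from this answer: the `Person` struct and the helpers `writePerson`, `readPerson`, and `toBytes` are hypothetical names, and integers are written in the host's native byte order.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical struct whose text is stored outside the object itself.
struct Person {
    std::int32_t num;
    std::string name;
};

// Writes num, then the name length, then the name's characters.
void writePerson(std::ostream& out, const Person& p) {
    out.write(reinterpret_cast<const char*>(&p.num), sizeof p.num);
    const std::uint32_t len = static_cast<std::uint32_t>(p.name.size());
    out.write(reinterpret_cast<const char*>(&len), sizeof len);
    out.write(p.name.data(), len);
}

// Reads the fields back in the same order; returns false on a short read.
bool readPerson(std::istream& in, Person& p) {
    std::uint32_t len = 0;
    in.read(reinterpret_cast<char*>(&p.num), sizeof p.num);
    in.read(reinterpret_cast<char*>(&len), sizeof len);
    if (!in) return false;
    p.name.resize(len);
    if (len > 0) in.read(&p.name[0], len);
    return static_cast<bool>(in);
}

// Convenience helper: serializes into an in-memory byte string.
std::string toBytes(const Person& p) {
    std::ostringstream out(std::ios::binary);
    writePerson(out, p);
    return out.str();
}
```

Because the string's characters are copied explicitly, the pointer held inside `std::string` never reaches the output.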

If you are sending data across a network, you will also need to ensure that the struct is packed, or at least of known alignment, and that integers are converted to a consistent endianness (network byte order is normally big endian).
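As an illustrative sketch of the endianness step, a 32-bit integer can be turned into big-endian bytes with shifts, which gives the same result on any host. The function names `toBigEndian` and `fromBigEndian` are made up for this example.

```cpp
#include <array>
#include <cstdint>

// Splits a 32-bit value into four bytes, most significant byte first.
std::array<unsigned char, 4> toBigEndian(std::uint32_t v) {
    return {{
        static_cast<unsigned char>((v >> 24) & 0xFFu),
        static_cast<unsigned char>((v >> 16) & 0xFFu),
        static_cast<unsigned char>((v >> 8) & 0xFFu),
        static_cast<unsigned char>(v & 0xFFu)
    }};
}

// Rebuilds the value from four big-endian bytes.
std::uint32_t fromBigEndian(const std::array<unsigned char, 4>& b) {
    return (static_cast<std::uint32_t>(b[0]) << 24) |
           (static_cast<std::uint32_t>(b[1]) << 16) |
           (static_cast<std::uint32_t>(b[2]) << 8) |
           static_cast<std::uint32_t>(b[3]);
}
```

Shifting works on the value rather than on its memory representation, so the code does not need to know the host's own byte order.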

doron