10

I want to write an object into a sequential file using fwrite. The class looks like this:

class A{
    int a;
    int b;
public:
    //interface
};

When I write an object to a file, I am wondering whether I could use fwrite(this, sizeof(int), 2, fo) to write the first two integers.

The question is: is `this` guaranteed to point to the start of the object data, even if a virtual table pointer may exist at the very beginning of the object, so that the operation above is safe?

Joey.Z
  • Why don't you simply try? Publish a and b, add a virtual method and then check address of A::a. – Spook May 29 '13 at 07:00
  • @Spook I guess he tried on a POD class, and it worked :) – BЈовић May 29 '13 at 07:00
  • @Spook I think he wants to know what the ISO standard says. At least I do. – Edward A May 29 '13 at 07:02
  • @EdwardA the standard makes few promises with respect to memory layout: basically you only know that same-access-level members are in order and grouped (or something like that), but that's about it. – Luchian Grigore May 29 '13 at 07:04
  • @LuchianGrigore and that's the answer to the question: "few promises" = *no guarantees* – Arne Mertz May 29 '13 at 07:16
  • @Spook I tried it in my code and it worked. The class currently has only some ints and pointers to extra data, and no virtual functions. But since the class may be extended in the future, virtual functions may be added, and I haven't tried that. – Joey.Z May 29 '13 at 07:25
  • @BЈовић Also, what does the term 'POD' refer to? – Joey.Z May 29 '13 at 07:28
  • @zoujyjs For what the Standard guarantees, and for how that relates to the meaning of 'POD', please see my answer. – jogojapan May 29 '13 at 07:33

4 Answers

5

No, it's not. You could use fwrite(&a, sizeof(int), 2, fo), but you shouldn't do that either. Strolling over raw memory is seldom a good idea when it comes to safety, because you should not rely on specific memory layouts. Someone could introduce another variable c between a and b without noticing that they are breaking your code. If you want to access your variables, do that explicitly. Don't just access the memory where you think the variables are, or where they were the last time you checked.
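As a rough illustration of accessing the members explicitly, here is a minimal sketch. The class mirrors the one in the question, but the constructor, the accessors, and the save, load and round_trip names are hypothetical additions for this example only. It still writes raw host-order ints, so the resulting file is not portable.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for the class in the question; save/load and the
// accessors are illustrative names, not part of the original code.
class A {
    int a = 0;
    int b = 0;
public:
    A() = default;
    A(int a_, int b_) : a(a_), b(b_) {}

    int first() const { return a; }
    int second() const { return b; }

    // Write each member by name; returns true if both writes succeeded.
    bool save(std::FILE* fo) const {
        assert(fo != nullptr);
        return std::fwrite(&a, sizeof a, 1, fo) == 1
            && std::fwrite(&b, sizeof b, 1, fo) == 1;
    }

    // Read the members back in the same order they were written.
    bool load(std::FILE* fi) {
        assert(fi != nullptr);
        return std::fread(&a, sizeof a, 1, fi) == 1
            && std::fread(&b, sizeof b, 1, fi) == 1;
    }
};

// Round-trips an object through a temporary file.
bool round_trip(const A& in, A& out) {
    std::FILE* f = std::tmpfile();
    if (!f) return false;
    bool ok = in.save(f);
    std::rewind(f);
    ok = ok && out.load(f);
    std::fclose(f);
    return ok;
}
```

Because each member is named, inserting a new member between a and b, or adding a virtual function, does not silently change what gets written.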

Arne Mertz
  • So the simplest safe way to do this is to write the members independently? Like `fwrite(&a...); fwrite(&b)` etc. Is that correct? – Joey.Z May 29 '13 at 07:39
  • If it really *has* to be fwrite, yes. But you really should consider using some serialization library. The files you write with fwrite are not portable. – Arne Mertz May 29 '13 at 07:41
  • Platform ABIs define fixed structure layouts for data interchange. Having standard-layout types, and POD types, is essential for interfacing with an OS and shared libraries. So while you don't need to _stroll over raw memory_ you certainly need guarantees about layout, and where `this` points. – edA-qa mort-ora-y May 29 '13 at 09:23
  • @edA-qamort-ora-y well yes, there are use cases in low-level code where direct raw memory access is needed and useful. But as you say, those uses and the layout guarantees are restricted to standard-layout types. Those cases should be well encapsulated and thoroughly documented. However, the OP's case (writing to a file from a possibly non-standard-layout object) does not qualify for such raw memory access. – Arne Mertz May 29 '13 at 09:34
5

`this` provides the address of the object, which is not necessarily the address of the first member. The only exceptions are so-called standard-layout types. From the C++11 Standard:

(9.2/20) A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [ Note: There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning, as necessary to achieve appropriate alignment. — end note ]

This is the definition of a standard-layout type:

(9/7) A standard-layout class is a class that:
— has no non-static data members of type non-standard-layout class (or array of such types) or reference,
— has no virtual functions (10.3) and no virtual base classes (10.1),
— has the same access control (Clause 11) for all non-static data members,
— has no non-standard-layout base classes,
— either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and
— has no base classes of the same type as the first non-static data member.[108]

[108] This ensures that two subobjects that have the same class type and that belong to the same most derived object are not allocated at the same address (5.10).

Note that the object type does not have to be a POD – having standard-layout as defined above is sufficient. (PODs all have standard-layout, but in addition, they are trivially constructible, trivially movable and trivially copyable.)
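The standard-layout property can be checked at compile time with the C++11 type traits. The following is a minimal sketch; the types Plain, WithVirtual and MixedAccess and the helper functions are hypothetical names made up for this illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical types for illustration only.
struct Plain {            // same access control for all members, no virtuals
    int a;
    int b;
};

struct WithVirtual {      // a virtual function breaks standard-layout
    int a;
    int b;
    virtual ~WithVirtual() {}
};

struct MixedAccess {      // non-static data members with different access control
    int a;
    int get_b() const { return b; }
private:
    int b;
};

static_assert(std::is_standard_layout<Plain>::value,
              "Plain is standard-layout");
static_assert(!std::is_standard_layout<WithVirtual>::value,
              "virtual functions rule out standard-layout");
static_assert(!std::is_standard_layout<MixedAccess>::value,
              "mixed access control rules out standard-layout");

// For a standard-layout object, the object's address and the address of
// its initial member coincide (9.2/20).
inline bool starts_at_first_member(const Plain& p) {
    return reinterpret_cast<const void*>(&p) == static_cast<const void*>(&p.a);
}

// Runtime check with a concrete object.
inline void check_plain() {
    Plain p{1, 2};
    assert(starts_at_first_member(p));
}
```

Placing such a static_assert next to any code that relies on the layout makes the build fail if someone later adds a virtual function or changes the access control.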

As far as I can tell from your code, your type seems to be standard-layout (make sure access control is the same for all non-static data members). In this case, `this` will indeed point to the initial member. Regarding using `this` for the purposes of serialization, the Standard actually says explicitly:

(9/9) [ Note: Standard-layout classes are useful for communicating with code written in other programming languages. Their layout is specified in 9.2. — end note ]

Of course this does not solve all problems of serialization. In particular, you won't get portability of the serialized data (e.g. because of endianness incompatibility).
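One common way around the endianness issue is to fix the byte order in the file format instead of writing the in-memory representation. The sketch below assumes a little-endian file format and 32-bit unsigned values; the function names write_u32_le and read_u32_le are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Writes a 32-bit value as four bytes, least significant byte first,
// regardless of the byte order of the host.
bool write_u32_le(std::FILE* fo, std::uint32_t v) {
    assert(fo != nullptr);
    const unsigned char bytes[4] = {
        static_cast<unsigned char>(v & 0xFF),
        static_cast<unsigned char>((v >> 8) & 0xFF),
        static_cast<unsigned char>((v >> 16) & 0xFF),
        static_cast<unsigned char>((v >> 24) & 0xFF)
    };
    return std::fwrite(bytes, 1, 4, fo) == 4;
}

// Reads four bytes and reassembles them in the same fixed order.
bool read_u32_le(std::FILE* fi, std::uint32_t& v) {
    assert(fi != nullptr);
    unsigned char bytes[4];
    if (std::fread(bytes, 1, 4, fi) != 4) return false;
    v = static_cast<std::uint32_t>(bytes[0])
      | (static_cast<std::uint32_t>(bytes[1]) << 8)
      | (static_cast<std::uint32_t>(bytes[2]) << 16)
      | (static_cast<std::uint32_t>(bytes[3]) << 24);
    return true;
}
```

Since the shifts operate on values rather than on memory, the same code produces the same file on little-endian and big-endian machines.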

jogojapan
  • I see, so POD refers to something like a struct in C: no need for a constructor, and bitwise copy semantics. – Joey.Z May 29 '13 at 07:44
  • @zoujyjs Yes, that is a pretty good characterization (see also [this question](http://stackoverflow.com/questions/146452/what-are-pod-types-in-c)). But again, to be able to use `this` to refer to the initial data member, you don't need a POD. Standard-layout is sufficient. – jogojapan May 29 '13 at 07:45
1

Many answers have correctly said "No". Here is some code that demonstrates why `this` is not guaranteed to point to the start of the object:

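#include <iostream>

class A {
    // Prints the address that `this` holds inside A's member function.
    public: virtual void value1() { std::cout << this << "\n"; }
};

class B {
    // Prints the address that `this` holds inside B's member function.
    public: virtual void value2() { std::cout << this << "\n"; }
};

class C : public A, public B {};

int main() {
    C* c = new C();
    A* a = (A*) c;  // pointer to the A subobject
    B* b = (B*) c;  // pointer to the B subobject; may be adjusted by the cast
    a->value1();
    b->value2();
    delete c;
    return 0;
}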

Please note the use of `this` in the virtual methods.

The output can (depending on the compiler) show you that pointers a and b are different. Most likely, a will point to the start of the object, but b will not. The problem appears most easily when multiple inheritance is in use.

Kevin A. Naudé
  • @LuchianGrigore That is true if you can guarantee that no casts have previously happened, i.e. you have the original pointer at the start. If you are receiving the pointer through a function interface, there is no such guarantee. – Kevin A. Naudé May 29 '13 at 07:12
  • @LuchianGrigore Ok, I altered the example code to show explicit use of `this`. I'm not arguing against other answers. I am showing that the link between `this` and the _start of the object_ is a very tenuous notion. My intention is to add useful information. – Kevin A. Naudé May 29 '13 at 07:19
  • Ah, I must admit I missed the `cout< – Luchian Grigore May 29 '13 at 07:26
  • @LuchianGrigore No problem, we're on the same page :) – Kevin A. Naudé May 29 '13 at 07:31
0

Writing an object to a file using fwrite is a very bad idea for many reasons. For example, if your class contains a std::vector<int>, you would be saving the vector's internal pointers to the integers, not the integers themselves.

For "higher-level" reasons (alignment, versioning, binary compatibility) it's also a bad idea in most cases even in C and even when the members are just simple native types.
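To make the std::vector point concrete, here is a minimal sketch of writing the elements rather than the vector object. The function names save_ints and load_ints and the length-prefix format are assumptions made for this example. The file still uses host-sized, host-order integers, so the versioning and compatibility concerns above apply unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <type_traits>
#include <vector>

// A std::vector manages heap storage through internal pointers, so copying
// its raw bytes to a file would not store the elements.
static_assert(!std::is_trivially_copyable<std::vector<int>>::value,
              "std::vector is not trivially copyable");

// Writes the element count followed by the elements themselves.
bool save_ints(std::FILE* fo, const std::vector<int>& v) {
    assert(fo != nullptr);
    const unsigned long long n = v.size();
    if (std::fwrite(&n, sizeof n, 1, fo) != 1) return false;
    return n == 0
        || std::fwrite(v.data(), sizeof(int), v.size(), fo) == v.size();
}

// Reads the count, resizes the vector, then reads the elements.
// Note: the count is trusted here; real code should validate it first.
bool load_ints(std::FILE* fi, std::vector<int>& v) {
    assert(fi != nullptr);
    unsigned long long n = 0;
    if (std::fread(&n, sizeof n, 1, fi) != 1) return false;
    v.resize(static_cast<std::size_t>(n));
    return n == 0
        || std::fread(v.data(), sizeof(int), v.size(), fi) == v.size();
}
```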

6502