In the stxxl FAQ, I found this:

Parameterizing STXXL Containers

STXXL container types like stxxl::vector can be parameterized only with a value type that is a POD (i.e. no virtual functions, no user-defined copy assignment/destructor, etc.) and does not contain references (including pointers) to internal memory. Usually, "complex" data types do not satisfy this requirements.

This is why stxxl::vector<std::vector<T> > and stxxl::vector<stxxl::vector<T> > are invalid. If appropriate, use std::vector<stxxl::vector<T> >, or emulate a two-dimensional array by doing index calculation.
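The index calculation the FAQ mentions could look roughly like the sketch below. `Grid2D` and `grid_demo` are hypothetical names made up for illustration, not part of STXXL, and `std::vector` stands in for the flat container; the sketch only assumes the container can be sized at construction and indexed with `operator[]`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of STXXL: a 2-D view over any flat
// container that can be sized at construction and indexed with operator[].
template <typename Container>
class Grid2D {
public:
    using value_type = typename Container::value_type;

    Grid2D(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols) {}

    // Row-major index calculation: element (r, c) lives at r * cols + c.
    value_type& at(std::size_t r, std::size_t c) {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    Container data_;
};

// std::vector is only a stand-in for the flat container in this sketch.
inline int grid_demo() {
    Grid2D<std::vector<int>> g(3, 4);
    g.at(2, 1) = 42;  // stored at flat index 2 * 4 + 1 = 9
    return g.at(2, 1);
}
```

Because every element is a plain `int` in one flat sequence, no element holds a pointer or needs a constructor call, which is the property the FAQ asks for.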

The inability to use stxxl::vector<std::vector<T> > makes perfect sense, as stxxl containers do not call constructors or destructors of the contained elements upon container resize. But what about storing a struct like this:

struct S {
    int* a;
};

If I guarantee that the object pointed to by a stays valid for as long as the stxxl::vector<S> instance exists, what is the problem with storing this struct in a stxxl::vector<S>? If a particular instance of S has to be moved to disk, then the value of the pointer a is written to disk. Later on, the pointer value is restored and I can use it. Obviously the pointer value is machine-dependent and instance-dependent too, but does that matter if I take care of the lifetime of the pointed-to object? I am not sending a serialized object over a socket, and I am not storing a serialized object in a database for later use.

Am I missing something?

EDIT: someone reminded me that stxxl does not copy the pointee, and therefore I may get a pointer-to-garbage when I retrieve an instance of struct S later on. I know that. I'll guarantee that the pointee is valid throughout the full lifetime of the program.

Jonas
gd1
  • My guess would be that the data are stored somewhere outside of the virtual memory address space in which these pointers make sense. – juanchopanza Feb 24 '14 at 14:24
  • In that case, these pointers make sense again when copied back into the context in which they were 'born'. Isn't a pointer just an integer? Please note that `struct S` does not care about the creation/destruction of the object pointed to by `a` – gd1 Feb 24 '14 at 14:25
  • The containers must be designed for value semantics, like their standard library counterparts. So any pointers to internal data must point to memory managed by the containers themselves, not to some external entities. – juanchopanza Feb 24 '14 at 14:27
  • Is that an abstract principle or does it translate into actual behaviour? I.e., if I store a pointer, say `0x673f45e6`, in `a`, when I retrieve that `S` instance will I get `0x673f45e6` (regardless of what it points to, garbage or not) or something else entirely? Thanks. – gd1 Feb 24 '14 at 14:33
  • I am not knowledgeable enough in STXXL (otherwise I would attempt to answer the question) but I imagine if you store a pointer, it keeps its value and you can de-reference it later as long as it points to a valid object. I would find it surprising if it did something else. – juanchopanza Feb 24 '14 at 14:36
  • I think (not familiar with this lib) this is just a documentation issue: they tried to make it hard for a beginner to miss that restriction, without entering into the details of valid advanced uses. – Marc Glisse Feb 24 '14 at 14:47

3 Answers

(including pointers) to internal memory

This means a pointer to a member of the struct, or otherwise a pointer into the memory that the container manages. E.g. you have

struct Foo {
     int *a;
     int b;
};

Foo f;
f.a = &f.b;

Since f.a now points to a member of the struct, and that struct could be copied around, the pointer can become invalid. Similarly, if the pointer points to any other struct Foo managed by the container, that one could be moved around too.

If you just have a pointer, and manage what it points to, you should be fine.
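The difference between the two cases can be sketched as follows. The byte-wise `relocate` helper is a hypothetical stand-in for whatever the container does when it moves an element without running constructors; it is an assumption about the mechanism, not STXXL's actual code.

```cpp
#include <cassert>
#include <cstring>

struct Foo {
    int* a;
    int b;
};

// Byte-wise copy, standing in for a container relocating an element
// without calling any constructor (assumed mechanism, not STXXL code).
inline Foo relocate(const Foo& src) {
    Foo dst;
    std::memcpy(&dst, &src, sizeof(Foo));
    return dst;
}

// True when the pointer targets the struct's own member.
inline bool points_into_self(const Foo& f) {
    return f.a == &f.b;
}

inline bool internal_pointer_survives() {
    Foo f;
    f.b = 7;
    f.a = &f.b;                      // pointer into the element itself
    Foo moved = relocate(f);
    return points_into_self(moved);  // moved.a still targets the old f.b
}

inline bool external_pointer_survives() {
    static int external = 7;         // lives outside any container
    Foo f;
    f.b = 0;
    f.a = &external;
    Foo moved = relocate(f);
    return moved.a == &external && *moved.a == 7;
}
```

In this sketch `internal_pointer_survives()` reports false, because the copied pointer still refers to the old location of `b`, while `external_pointer_survives()` reports true as long as the external object outlives the copy.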

nos
  • Thank you, nos. I thought *internal memory* just meant *main memory*, as the library authors usually refer to *external memory* as the non-main memory (i.e. the disk). I made a wrong assumption, then. – gd1 Feb 24 '14 at 14:46
  • This use of the term internal memory is normal in these contexts. +1 – Tony Delroy Feb 24 '14 at 14:54

On implementations with strict pointer safety, the fact that you've saved a pointer to disk is insufficient. If that pointer is no longer in memory, the object it pointed to is no longer valid - not even if you restore the pointer bits from disk. In particular, it may have been garbage collected without running any dtor.

MSalters

I imagine it's because the data in the containers are copied in using a memcpy type approach - so if you had a pointer inside your class that you were storing, you'd copy the pointer and not the pointed-to data.

When you serialise such a structure, the pointed-to data will not be serialised, only the pointer. When you restore the data, you'll have a pointer that points to garbage.
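A minimal sketch of the shallow copy described here, using a raw byte buffer to imitate "write out, read back". The `round_trip` helper is hypothetical and is not STXXL's real I/O path. It shows that only the pointer bits travel, so the restored pointer is exactly as valid as the pointee's lifetime allows.

```cpp
#include <cassert>
#include <cstring>

struct S {
    int* a;
};

// Simulates "write to external storage, read back" with a byte buffer.
// This is a stand-in for the paging step, not STXXL's real I/O code.
inline S round_trip(const S& in) {
    unsigned char block[sizeof(S)];
    std::memcpy(block, &in, sizeof(S));  // copies the pointer bits only
    S out;
    std::memcpy(&out, block, sizeof(S));
    return out;
}

inline bool pointer_value_preserved() {
    int pointee = 123;                   // owned by the caller, never copied
    S original{&pointee};
    S restored = round_trip(original);
    pointee = 456;                       // visible through the restored pointer
    return restored.a == &pointee && *restored.a == 456;
}
```

If `pointee` had been destroyed before the read-back, `restored.a` would hold the same address but dereferencing it would be undefined behaviour, which is the "garbage" case.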

gd1
gbjbaanb
  • Copying the pointer, and not the pointee, is exactly what I want to do. Whether the pointer points to garbage or not should be my business. I can assure you that I will protect the object pointed to by `a` at the cost of my life. What's the problem then? – gd1 Feb 24 '14 at 14:29
  • @gd1 then go ahead and ignore their recommendation, put your pointers in there and see what happens. Of course, you'll want to comment that heavily so no one else misunderstands this atypical behaviour. – gbjbaanb Feb 24 '14 at 14:37
  • The point is not what I am doing or what I will do. If they recommend singing a song while writing code based on their library, I'll do it. Still, I am one of those people who love to know the *why* of things. If you know the reason, please contribute. Otherwise there is no problem at all in leaving the question unanswered, saving both of us lots of time. – gd1 Feb 24 '14 at 14:38
  • @gd1 I am telling you the why. It's a beginner's mistake when learning C to forget that a pointer and its data are not the same thing. This library is telling you the same thing, but in a roundabout way. It's also why they say not to use complex types either. Debug into their copy constructor and you'll see. – gbjbaanb Feb 24 '14 at 14:56