In C++, how "heavy" is a data structure when compared to a data structure pointer?

Question

Let's take a concrete example:

I want to use a vector of strings and have the choice of defining that vector as one of the following:

std::vector<string> myVector;

or

std::vector<string>* myPtrVector = new vector<string>;

I understand that myPtrVector is a memory address, but what is myVector? When I call myVector.push_back(someString), does it also use the memory address?

To consolidate my questions:

1). I would like to know how the computer treats normal (non-pointer) variables and how they are different from pointer variables.

2). I would also like to know if there is any advantage to declaring data structures (vectors, maps, stacks, etc.) as pointers.

Thank y'all in advance for answering! I am new here and if you have any advice for how I should have asked my question differently I would be happy to hear it.

`std::vector<string>* myPtrVector = new vector<string>;` -- Who cleans up the memory? There is no automatic garbage collector in C++ for this. — PaulMcKenzie, Jun 15 '21 at 14:10
Does this answer your question? [When vectors are allocated, do they use memory on the heap or the stack?](https://stackoverflow.com/questions/8036474/when-vectors-are-allocated-do-they-use-memory-on-the-heap-or-the-stack) — Jeffrey, Jun 15 '21 at 14:11
The STL container classes aren't really very 'heavy' in themselves. They essentially have a pointer to their actual data, along with the required control/status variables and flags. — Adrian Mole, Jun 15 '21 at 14:12
Very closely related: [How is vector implemented in C++](https://stackoverflow.com/q/3064559/10077) — Fred Larson, Jun 15 '21 at 14:13
@AdrianMole but making a copy of a pointer will be much easier than making a copy of the entire container. The only question you need to answer is how often you'll need to do that. — Mark Ransom, Jun 15 '21 at 14:13
`std::vector` is basically just 3 pointers, or a pointer and 2 integers. There is generally no good reason to use `std::vector<string>* myPtrVector = new vector<string>;`. I would always flag such an expression as an error, unless the reason for it is clearly and convincingly documented. — François Andrieux, Jun 15 '21 at 14:13
*In C++, how “heavy” is a data structure when compared to a data structure pointer?* The data structure is lighter than the data structure pointer. Because the data structure pointer needs to be managed, and that management is federated, and has a much higher chance of introducing management related bugs. — Eljay, Jun 15 '21 at 14:14
Also see: https://stackoverflow.com/questions/55478523/how-does-stdvector-support-contiguous-memory-for-custom-objects-of-unknown-siz — NathanOliver, Jun 15 '21 at 14:15
@OP `std::vector<string>* myPtrVector = new vector<string>;` -- On most good code reviews, you will be asked "why are you doing this?", and you better have a good answer. — PaulMcKenzie, Jun 15 '21 at 14:23
@PaulMcKenzie and "That's how we do it in Java and C#" is *not* a good answer. — Mark Ransom, Jun 15 '21 at 14:55

score 0 · Answer 1 · answered Jun 15 '21 at 14:44

myVector is a local variable. Depending on the context of its use, it may be stored in different places, but the important thing is that it is automagically destroyed when its scope ends (e.g. if it is local to a function, when the function returns). Internally, the vector allocates from the free store, meaning that when you push_back something, it is added to non-local memory (and more is allocated if needed). Internally it uses a pointer to get at that memory, but you don't need to worry about that because the vector guarantees that it cleans up after itself.
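
To make this concrete, here is a minimal sketch. The helper name `vectorObjectSizeAfter` is made up for illustration. It shows that the vector object itself stays the same size no matter how many strings it holds, because the elements live in separately allocated storage that the vector releases on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper for illustration: fills a local vector with `count`
// strings and returns the size of the vector object itself.
std::size_t vectorObjectSizeAfter(std::size_t count) {
    std::vector<std::string> myVector;       // automatic (local) variable
    for (std::size_t i = 0; i < count; ++i) {
        myVector.push_back("someString");    // element storage comes from the free store
    }
    assert(myVector.size() == count);        // the elements are all there...
    return sizeof(myVector);                 // ...but this measures only the object itself
}   // myVector is destroyed here and releases its element storage on its own
```

Calling `vectorObjectSizeAfter(0)` and `vectorObjectSizeAfter(1000)` gives the same number. The exact value depends on the standard library implementation.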

As a final note, accessing a vector through a pointer is unlikely to be faster than using it directly, and moving a vector around is already cheap. Going through a pointer adds one more indirection, which can cost an extra cache miss, and you also have to manage the vector's lifetime yourself.

score 0 · Answer 2 · answered Jun 15 '21 at 18:05

I will use an example with this simple structure. We will assume that the size of int is 4 bytes, so this structure will be 8 bytes in size.

struct Point {
  int x;
  int y;

  Point(int x, int y){
    this->x = x;
    this->y = y;
  }
};

Now we will create 2 variables inside our nice function. One will be a normal variable, the other will be a pointer.

void niceFunction(){
    Point  normalVariable(10, 5);
    Point* pointer = new Point(10, 5);
}
int main(){
    niceFunction(); // we call our function
}

What is the difference? The first variable contains our whole structure. If you use sizeof(normalVariable), you will get 8 bytes as the result.

The second contains the address of our structure. When you use sizeof(pointer), you will get 4 bytes on a 32-bit system (typically 8 bytes on a 64-bit system). Our structure is somewhere in dynamic memory and this pointer only contains its address.
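
The two sizes can be checked with a small sketch like the one below. The function names `sizeOfObject` and `sizeOfPointer` are made up for illustration, and the actual numbers depend on your platform.

```cpp
#include <cassert>
#include <cstddef>

struct Point {
    int x;
    int y;
};

// Size of the structure itself: both ints are stored inside it.
std::size_t sizeOfObject() {
    Point normalVariable{10, 5};
    return sizeof(normalVariable);
}

// Size of a pointer: only an address, no matter how big the pointee is.
std::size_t sizeOfPointer() {
    Point normalVariable{10, 5};
    Point* pointer = &normalVariable;
    assert(pointer->x == 10);   // the pointer still leads to the whole structure
    return sizeof(pointer);
}
```

With a 4-byte int, `sizeOfObject()` would be expected to return 8, while `sizeOfPointer()` returns the size of an address on the target system.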

Allocation

Allocation is the act of requesting memory for our structure. The system must find free memory and give it to us. We then fill this memory with our data (in our case, the numbers 10 and 5).

In the first case (normal variable), allocation is faster, because the system uses the stack for local variables and thus always knows where your data is stored and where the free area begins. This has one problem: the stack has a limited size, and if you allocate too many structures on it, it will be exhausted and your program will be terminated by the OS. This error is called a stack overflow.

In the second case (pointer), the system must find a free area in dynamic memory and place your structure there. This takes some time. In exchange, you have far more memory available (you are limited only by the size of your RAM and by restrictions imposed by the operating system).

Deallocation

Deallocation is the act of returning memory to the system. Returned memory will be reused when needed.

In the first case, deallocation is automatic. Because local variables exist only inside the function, there is no need to deallocate them yourself: they are removed from the stack once our function ends.

In the second case (pointer), our structure is somewhere in dynamic memory, and the system cannot know when we are done with it. We must explicitly state that we want to deallocate our structure:

void niceFunction(){
    Point  normalVariable(10, 5);
    Point* pointer = new Point(10, 5);

    delete pointer; // this will deallocate our structure
}

If we forget to deallocate a structure in dynamic memory, it will stay there until the end of the program. This is called a memory leak.
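
The difference in lifetime can be observed with a rough sketch like this one. `Tracked` and `makeBoth` are hypothetical names introduced only for this illustration: the type counts how many of its instances are currently alive.

```cpp
#include <cassert>

// Hypothetical type that counts how many instances are currently alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    Tracked(const Tracked&) { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Creates one automatic object and one dynamic object, and returns the
// address of the dynamic one so that the caller can delete it later.
Tracked* makeBoth() {
    Tracked normalVariable;          // destroyed automatically at the closing brace
    Tracked* pointer = new Tracked;  // stays alive until someone calls delete
    assert(Tracked::alive == 2);     // both exist while we are inside the function
    return pointer;
}
```

After `makeBoth()` returns, `Tracked::alive` is 1: the automatic object is gone, but the dynamic one remains until the caller runs `delete` on the returned pointer.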

Argument passing / Copying

Imagine a function that has 2 parameters (points a and b) and returns their distance. We will use normal variables in the first version and pointers in the second.

float distanceNormal(Point a, Point b){
  // body of the function is irrelevant for us
}
float distancePointer(Point* a, Point* b){
  // body of the function is irrelevant for us
}

In the first case, when we pass our points into the function, the system will copy them. That means it will allocate new memory on the stack for the function's parameters and copy each whole structure into that memory. This can be inefficient if your structures are massive or if copying them is an expensive operation. This also means that two normal variables cannot share one structure: they can have the same values, but they are not the same structure.
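
A minimal sketch of this copying behaviour follows. The names `changeXByValue` and `xAfterPassingByValue` are made up for illustration, and `Point` is reduced to a plain aggregate to keep the example short.

```cpp
#include <cassert>

struct Point {
    int x;
    int y;
};

// Receives a copy: the change is made to the copy only.
void changeXByValue(Point point) {
    point.x = 50;
}

// Hypothetical helper: returns the caller's x after the call above.
int xAfterPassingByValue() {
    Point original{10, 5};
    changeXByValue(original);   // the function works on its own copy
    assert(original.y == 5);    // the original is untouched
    return original.x;          // still 10
}
```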

In the second case, the structure is not copied. Instead, the system passes only the address of your structure in dynamic memory. This also has one effect: when you modify the structure through the pointer, you change the original structure, because both pointers hold the same address.

Point* pointer = new Point(10, 5); // values of our point are (10, 5)

void changeX(Point* point){
    point->x = 50;
}
changeX(pointer); // after this call, values of our point are (50, 5)

This is exploited by methods of structures/objects. Do you see that weird variable called this in the constructor? That is also a pointer: it points to our structure. This allows us to manipulate the values of the structure directly.
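
As a small illustration, the sketch below checks that `this` inside a member function is the address of the object the function was called on. The members `self` and `setX` and the function `thisIsAddressOfObject` are hypothetical names added for this example.

```cpp
#include <cassert>

struct Point {
    int x;
    int y;

    // Returns the value of `this`, i.e. the address of the object
    // the member function was called on.
    const Point* self() const { return this; }

    // Writing through `this` changes the object itself, not a copy.
    void setX(int newX) { this->x = newX; }
};

// Hypothetical check: `this` inside a member function equals the object's address.
bool thisIsAddressOfObject() {
    Point p{10, 5};
    p.setX(50);
    assert(p.x == 50);       // the member function modified p directly
    return p.self() == &p;
}
```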

Conclusion

You can see that both of them have advantages and disadvantages. When should you use them?

Use normal variables when:

you don't use its content outside of the function: when you have something that is used only inside a function, there isn't any reason to allocate it in dynamic memory. The stack is the better option for this.
you want automatic deallocation: in our case, the normal variable will be deallocated automatically. The second one (pointer) must be deallocated manually.
you don't want to share it with a function: when you use a pointer, you are working with the original and changes will be applied to the original. When you use a normal variable, you are working with a copy and the original structure will be unaffected.

Use pointers when:

you want to work with the original: if you want to manipulate the original structure inside a function, use a pointer.
you want to avoid copying: if you have a large structure and a function that is called 1000 times, it is inefficient to pass a normal variable. Use a pointer to avoid copying.
you want to avoid automatic deallocation: sometimes automatic deallocation isn't what you want. For example, if you allocate a structure for your game that has hundreds of bytes as a normal variable, it will be deallocated at the end of the function. This never happens with pointers; you must deallocate them manually using delete.

Golden rule: use normal variables where you can, use pointers where you must.
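
Applied to the vector from the question, the rule could look roughly like the sketch below. The names `addGreeting` and `buildAndCount` are made up for illustration. The vector is an ordinary local variable, and a pointer appears only where a function has to work on the original.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Takes a pointer so that it works on the caller's vector instead of a copy.
void addGreeting(std::vector<std::string>* names) {
    names->push_back("hello");
}

// Hypothetical caller: the vector itself is an ordinary local variable.
std::size_t buildAndCount() {
    std::vector<std::string> myVector;   // no new, so no delete is needed
    addGreeting(&myVector);              // pass its address where sharing is required
    addGreeting(&myVector);
    assert(myVector.back() == "hello");
    return myVector.size();              // 2
}   // myVector and its strings are cleaned up here
```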

This was VERY informative! Thank you so much. — Ryan10937, Jun 16 '21 at 15:36
