I am coming from a C#/Java background into C++, using Visual Studio Community 2017 and plenty of tutorials. I have reached the point where I am unsure of the correct way to write a function that processes a vector of data. Should I force a function to use a pointer or a reference? Should I let the compiler sort it out? What is the best practice?

This is my main. I ask for the vector size as input, then pass a pointer to that integer to a function that creates a vector and populates it with values in a simple for loop. I then pass the vector to another function that shuffles it.

vector<int> intVector(int* count)
{
    vector<int> vi;
    for (int i = 1; i <= *count; i++)
        vi.push_back(i);
    return vi;
}

vector<int> &randVector(vector<int> *v)
{
    shuffle(v->begin(), v->end(), default_random_engine());
    return *v;
}

int _tmain(int argc, _TCHAR* argv[])
{
    int count;
    cout << "Enter vector array size: ";
    cin >> count; cout << endl;
    cout << "Vector of integers: " << endl;
    vector<int> vi = intVector(&count);

    for_each(vi.begin(), vi.end(), [](int i) {cout << i << " ";});
    cout << endl;
    vi = randVector(&vi);
    cout << "Randomized vector of integers: " << endl;
    for_each(vi.begin(), vi.end(), [](int i) {cout << i << " ";});
    cout << endl;

    return 0;
}

So my question is: what is the best practice in my case to avoid unnecessary copying? Should I even care about it? Should I rely on the compiler to solve it for me?

I am planning to use C++ for game development on desktop and consoles, so understanding memory and performance management is important to me.

AleksanderNaumenok
  • Why `vector<int> &randVector(vector<int> *v)` instead of simple `void randVector(vector<int>& v)`, and calling it by simply writing `randVector(vi);`? Don't use pointers where you don't need them. – Algirdas Preidžius Jul 10 '17 at 16:17
  • That is a very good point, I also thought of it. Since am passing a pointer to memory location and operation is performed on the original object and not a copy. The return statement indeed became redundant. However, I felt uncertain in regards to what is considered best practice in c++ and here I am. – AleksanderNaumenok Jul 10 '17 at 16:20
  • First of all, learn about smart pointers - `std::unique_ptr`, `std::shared_ptr` & `std::weak_ptr` as well as `std::make_unique` and `std::make_shared` and try not to use manual memory management (raw pointers and `new`/`delete`). – Jesper Juhl Jul 10 '17 at 16:22
  • @JesperJuhl There’s no manual memory management in the code, and no smart pointers are needed. – Konrad Rudolph Jul 10 '17 at 16:22
  • @Konrad Rudolph I didn't say there was. I was just trying to give some general pointer related advice. – Jesper Juhl Jul 10 '17 at 16:24
  • Why do you type `int * count` instead of `int count`? Do you intend to modify `count`? Even if you want to modify, use reference when you can, use pointer only when you must. – raymai97 Jul 10 '17 at 16:25
  • I used `int* count` in order to avoid copying the value while passing to the function. I did not intend to modify it, so I could have written it as such instead `const int* count`. – AleksanderNaumenok Jul 10 '17 at 16:30
  • One thing to keep in mind is that pointers take memory too, and dereferencing a pointer (using the thing you are pointing at) takes a small amount of computation time. Your intVector(int* count) doesn't save any memory compared to intVector(int count), and is actually a little slower. – Darryl Jul 10 '17 at 16:38
  • @AleksanderNaumenok "I used int* count in order to avoid copying the value" - it's an `int` for crying out loud. You can hardly find anything cheaper to copy. Copying the pointer is likely to be as expensive or more expensive + you've now made the optimizers job much harder by adding a memory reference/load. Just pass by value. Keep it simple unless you have a reason not to. – Jesper Juhl Jul 10 '17 at 16:40
  • @Jesper Juhl haha, yes good point thank you. If I would pass a much larger object, say a Person object or a string containing 500+ letters. Would it make sense to pass it as a pointer or reference then? – AleksanderNaumenok Jul 10 '17 at 16:45
  • @AleksanderNaumenok yes, then you are passing into territory where passing a reference or pointer makes more sense. – Jesper Juhl Jul 10 '17 at 16:48
  • @Jesper Juhl in regards to your comment about smart pointers. I have a general understanding of them, but how would you suggest to use them for my current example? – AleksanderNaumenok Jul 10 '17 at 16:59
  • @AleksanderNaumenok I was commenting in terms of general advice. Not specific to your example. – Jesper Juhl Jul 10 '17 at 17:05

2 Answers

In C++, you are the one who decides whether an object is copied or not when it is passed around.

Regarding your example:

  • You can avoid using pointers and use a reference instead.

Like in the following:

vector<int>& randVector(vector<int>& v)
{
    shuffle(v.begin(), v.end(), default_random_engine());
    return v;
}

Note that since you are using a reference, the shuffle operation already modifies the vector that was passed to randVector, so there is no real need to return a reference to it.

As a rule of thumb, when you need to pass an object around and want to avoid a potentially expensive copy, you can use references:

void function(<const> Object& v)
{
//   do_something_with_v
}
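
To make the rule of thumb concrete, here is a minimal sketch of both flavours. The function names `sumOf`, `averageOf` and `doubleAll` are made up for illustration and do not come from the question: a `const` reference is used when the function only reads, and a non-const reference when it modifies the caller's object in place.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Read-only access: a const reference avoids copying the vector
// and prevents the function from modifying the caller's data.
long long sumOf(const std::vector<int>& v)
{
    return std::accumulate(v.begin(), v.end(), 0LL);
}

// Another read-only example; the assert documents the precondition.
double averageOf(const std::vector<int>& v)
{
    assert(!v.empty());
    return static_cast<double>(sumOf(v)) / static_cast<double>(v.size());
}

// In-place modification: a non-const reference lets the function
// change the caller's vector directly, again without a copy.
void doubleAll(std::vector<int>& v)
{
    for (int& x : v)
        x *= 2;
}
```

Calling `doubleAll(vi);` changes `vi` itself, while `sumOf(vi)` and `averageOf(vi)` can only inspect it.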
Davide Spataro
  • Your code sample will not compile (what's the meaning of `v->begin(), v->end()`, when `v` is **not** a pointer to a `vector`?). – Algirdas Preidžius Jul 10 '17 at 16:25
  • Thank you for your reply. One of the reasons I am asking whether the compiler will solve it for me is that, while looking for an answer online, I came across [Return value optimization](https://en.wikipedia.org/wiki/Return_value_optimization) and became a bit "surprised", so to say. – AleksanderNaumenok Jul 10 '17 at 16:27
  • "The compiler will never do anything for you" - weeell, not *quite* true in general. In real life the optimizer collapses quite a few abstractions and does much more on top. Do you usually just ship unoptimized binaries to your customers? ;-) – Jesper Juhl Jul 10 '17 at 16:27
  • @AleksanderNaumenok If you read about it in greater detail - you would see that it wouldn't apply in your case. – Algirdas Preidžius Jul 10 '17 at 16:29
  • @Algirdas Preidžius thank you, I was afraid that would be the case. Do you have any good sources I could read up on in regards to this issue? – AleksanderNaumenok Jul 10 '17 at 16:33
  • @AleksanderNaumenok you can have a look at this answer: https://stackoverflow.com/questions/12953127/what-are-copy-elision-and-return-value-optimization/12953150 – Davide Spataro Jul 10 '17 at 16:34
  • @Davide Spataro : I think you'll enjoy this: [Understanding Compiler Optimizations](https://m.youtube.com/watch?v=UHv_Jog9Xuc&feature=youtu.be). – Jesper Juhl Jul 10 '17 at 16:35

The rules on passing in C++ for typical code are pretty straightforward (though obviously still more complex than languages without references/pointers).

  • In general, prefer references to pointers, unless passing in null is actually something you might do
  • Prefer to write functions that don't mutate their inputs, and return an output by value
  • Inputs should be passed by const reference, unless it is a primitive type like an integer, which should be passed by value
  • If you need to mutate data in place, pass it by non-const reference

See https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rf-conventional for more details.

The upshot of this is that here are the "correct" signatures for your two functions:

vector<int> intVector(int count);

void randVector(vector<int> &v);
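
As a rough sketch, the two functions could be implemented with these signatures as follows. The use of `std::iota` and of a `std::mt19937` engine seeded once from `std::random_device` are my own choices, not something taken from the question, which default-constructs a `default_random_engine` on every call and is therefore likely to produce the same order on every run.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

using std::vector;

// count is a plain int, so it is passed by value.
// The result is returned by value; the caller receives it without a deep copy.
vector<int> intVector(int count)
{
    assert(count >= 0);
    vector<int> vi(static_cast<std::size_t>(count));
    std::iota(vi.begin(), vi.end(), 1); // fills with 1, 2, ..., count
    return vi;
}

// v is an in/out parameter: it is shuffled in place, so nothing is returned.
void randVector(vector<int>& v)
{
    // Assumption: one engine, seeded once, reused across calls.
    static std::mt19937 engine{std::random_device{}()};
    std::shuffle(v.begin(), v.end(), engine);
}
```

In main this becomes `vector<int> vi = intVector(count);` followed by `randVector(vi);`, with no pointers and no assignment of the vector to itself.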

This doesn't take iterators into account, which are probably the really correct "generic" way to write the second function, but that is a bit more advanced. Still, see std::shuffle, which lets you randomize any range with random-access iterators (a vector, a deque, a plain array) by working on iterators rather than on a specific container: http://en.cppreference.com/w/cpp/algorithm/random_shuffle.
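
For the curious, an iterator-based version might look roughly like this. `randRange` and its `seed` parameter are hypothetical names introduced for this sketch; the template just forwards to `std::shuffle`, so it works with any random-access iterators.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <random>
#include <vector>

// Shuffles the range [first, last) in place.
// RandomIt must be a random-access iterator, as std::shuffle requires.
template <typename RandomIt>
void randRange(RandomIt first, RandomIt last, unsigned seed)
{
    std::mt19937 engine{seed};
    std::shuffle(first, last, engine);
}

// Usage sketch: the same function handles a vector and a std::array.
void demo()
{
    std::vector<int> v{1, 2, 3, 4, 5};
    std::array<int, 3> a{{7, 8, 9}};
    const std::vector<int> before = v;

    randRange(v.begin(), v.end(), 42u);
    randRange(a.begin(), a.end(), 42u);

    // Shuffling only reorders elements; it never adds or removes any.
    assert(std::is_permutation(v.begin(), v.end(), before.begin()));
}
```

The fixed seed makes the order reproducible, which can be handy for tests; a real program would more likely pass in an engine that is seeded once elsewhere.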

Since you mentioned unnecessary copying: when you return something like a vector by value, it should never be copied (I'm assuming you're using C++11 or newer). It will instead be "moved", or the copy will be elided altogether, and neither has significant overhead. Thus, in newer C++ code, "out parameters" (passing arguments by reference so the function can write its result into them) are significantly discouraged compared to older versions. That is good to know in case you encounter dated advice. However, a by-reference parameter for something like shuffling or sorting is considered an "in/out" parameter: you want to mutate it in place, and the existing data is important rather than simply being overwritten.
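
If you want to convince yourself, a small experiment along these lines can help. `Tracked`, `makeTracked` and `checkNoCopies` are hypothetical names invented for this sketch; the type simply counts how often its copy operations run.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Element type that counts copies; moves are allowed and not counted.
struct Tracked {
    static int copies;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept {}
    Tracked& operator=(const Tracked&) { ++copies; return *this; }
    Tracked& operator=(Tracked&&) noexcept { return *this; }
};
int Tracked::copies = 0;

// Builds a vector locally and returns it by value.
std::vector<Tracked> makeTracked(std::size_t count)
{
    std::vector<Tracked> v(count); // default-constructs the elements (C++11 and later)
    return v;                      // elided or moved, not copied
}

// Returning by value and move-assigning the result copies no elements.
void checkNoCopies()
{
    Tracked::copies = 0;
    std::vector<Tracked> result = makeTracked(1000);
    result = makeTracked(10); // move assignment from a temporary
    assert(Tracked::copies == 0);
    assert(result.size() == 10);
}
```

The vector hands over its internal buffer instead of duplicating the elements, which is why the counter stays at zero.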

Nir Friedman
  • Thank you for your reply. If I had to switch out int value with something more substantial, like say a Person object. Then I would want to pass it as `vector intVector(const Person &p);` – AleksanderNaumenok Jul 10 '17 at 16:42
  • @Nir "Inputs should be passed by const reference, unless it is a primitive type like an integer, which should be passed by value" - actually, even if you are passing a struct it can make sense to pass it by value. Once you introduce a reference you are forcing a memory load/store (references are really just pointers), by passing by value the optimizer might be able to just put everything in registers and in any case, you are giving the constant collapsing/propagation pass a much harder job by introducing memory references. Always benchmark. Nothing is black and white. Know your compiler. – Jesper Juhl Jul 10 '17 at 16:53
  • @Jesper Juhl this, once I began reading in regards to compilers and what they do in the background, it threw me for a loop in regards to my assumptions about c++. – AleksanderNaumenok Jul 10 '17 at 16:57
  • @JesperJuhl I'm well aware of this, but getting into all of those details with someone who isn't yet sure when to use references versus pointers, why returning references from free functions is very rarely correct, etc, seems like massive overkill. Even the rule of thumb to pass types of size one word or smaller by value isn't correct: what if it has a copy constructor? But then we need to talk about trivially copyable, etc etc. Also it's rather rare that a constant can be propagated through function boundaries without inlining, AFAIK. Small inroads are being made recently with this. – Nir Friedman Jul 10 '17 at 17:59
  • @Nir fair enough. Just wanted to point out that it's not *simple*. – Jesper Juhl Jul 10 '17 at 18:06