
I'm converting some functions from Matlab to C++, and some of them work with matrices. I found this simple function somewhere on the Internet:

typedef std::vector<std::vector<double> > Matrix;

Matrix sum(const Matrix& a, const Matrix& b) {
  size_t nrows = a.size();
  size_t ncols = a[0].size();
  Matrix c(nrows, std::vector<double>(ncols));
  for (size_t i = 0; i < nrows; ++i) {
    for (size_t j = 0; j < ncols; ++j) {
      c[i][j] = a[i][j] + b[i][j];
    }
  }
  return c;
}

Can anyone explain to me why they used const Matrix& a as the input instead of Matrix a? Is it just a habit, or is there a benefit to it? I did not see any difference between the results of the two versions (const Matrix& a and Matrix a as input).

scmg
  • By taking the argument by reference instead of by value, you avoid copying the object. By adding const, you promise the caller that this function won't change anything. – Galimov Albert Apr 02 '16 at 06:28
  • Off topic: `std::vector<std::vector<double> >` is one of the easiest ways to make a matrix. It is one of the safest. It can also be one of the slowest. If you want performance, think hard before making a 2D anything other than a static array. There is a good write-up on a better way to do it here: https://isocpp.org/wiki/faq/operator-overloading#matrix-subscript-op – user4581301 Apr 02 '16 at 06:45
  • @user4581301 do you have any suggestion or (plausible) reference about making a 2D matrix (besides isocpp)? – scmg Apr 02 '16 at 06:58
  • Just profiling. With a vector of vectors you get an outer vector of size nrows and then nrows inner vectors of size ncols. All of these vectors are separate blocks of memory that could be anywhere in the system's memory, which reduces the compiler's optimization options and the CPU's ability to predict accesses and cache data. Look up spatial locality and its effect on cache misses. When possible, use a 1D container that guarantees contiguous data, and do the arithmetic to map 1D to 2D in a wrapping function. – user4581301 Apr 02 '16 at 07:10
  • @scmg You are asking very basic questions about C++. Language syntax (references, the `const` keyword) and the importance of keeping data contiguous in memory are not things you want to learn from SO answers. You need to pick [a book](http://stackoverflow.com/questions/388242/the-definitive-c-book-guide-and-list) and start learning the language basics if you are going to make any use of it. If you do not, you are doomed to ask questions like "Why is this C++ matrix multiplication code N times slower than similar Matlab code" later. – Ivan Aksamentov - Drop Apr 02 '16 at 07:13
  • Doing the math yourself looks like a performance hit, but the compiler is generating very similar work for you when it sees `a[row][col]` and both resolve down to nearly the same thing. The increased cache friendliness more than makes up for the rest in most cases. – user4581301 Apr 02 '16 at 07:15
  • Also, `const Type& a` allows an `rvalue` to be passed as the argument – puerile Feb 25 '22 at 07:23
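
The contiguous layout suggested in the comments above could look roughly like the sketch below. `FlatMatrix` and its members are hypothetical names chosen for illustration, not something from the original post; the sketch assumes row-major storage in a single `std::vector<double>` and uses `operator()` for 2D access, in the spirit of the isocpp FAQ entry linked in the comments.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper (not from the original post): one contiguous buffer,
// indexed in row-major order as data_[r * cols_ + c].
class FlatMatrix {
public:
    FlatMatrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    // Read/write access to element (r, c).
    double& operator()(std::size_t r, std::size_t c) {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }

    // Read-only access, usable through a const reference.
    double operator()(std::size_t r, std::size_t c) const {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> data_;
};

// Same element-wise sum as in the question, on the flat layout.
FlatMatrix sum(const FlatMatrix& a, const FlatMatrix& b) {
    assert(a.rows() == b.rows() && a.cols() == b.cols());
    FlatMatrix c(a.rows(), a.cols());
    for (std::size_t i = 0; i < a.rows(); ++i) {
        for (std::size_t j = 0; j < a.cols(); ++j) {
            c(i, j) = a(i, j) + b(i, j);
        }
    }
    return c;
}
```

Whether this is actually faster than the vector of vectors depends on matrix size and access pattern, so it is worth profiling both.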

2 Answers

  • The reference avoids copying large objects. For small objects, however, passing by value is preferable, since going through a reference can cost more than the copy it saves. This applies, for example, to data types no larger than a pointer on the given platform.
  • const makes sure that the given object will not be changed by the function. It is a safety check for the programmer writing the function and a contract for the caller. It also lets the caller pass either a const or a non-const object.
  • Some may argue that const without a reference doesn't make sense (as the original object will not be modified anyway). But, as a safety net, the function implementer is advised to declare parameters const so that the function doesn't change its arguments by mistake. That is the reason some functional languages make values const by default (unless you mark them mutable).
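
The points above can be sketched as follows. The helper names `count_elements`, `twice`, and `demo` are made up for this illustration; the `Matrix` typedef is the one from the question. The sketch assumes nothing beyond the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<double> > Matrix;

// Large object: taken by const reference, so the matrix is not copied
// and the function body cannot modify it.
std::size_t count_elements(const Matrix& m) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < m.size(); ++i) {
        n += m[i].size();
    }
    return n;
}

// Small object: taken by value. The const here only stops the function
// body from reassigning x; the caller's variable is untouched either way.
double twice(const double x) {
    return 2.0 * x;
}

// Hypothetical usage: a const reference parameter accepts a non-const
// object, a const object, and a temporary.
void demo() {
    Matrix m(2, std::vector<double>(3));          // non-const, 2 x 3
    const Matrix cm(4, std::vector<double>(5));   // const, 4 x 5

    assert(count_elements(m) == 6);
    assert(count_elements(cm) == 20);
    assert(count_elements(Matrix(1, std::vector<double>(7))) == 7);  // temporary
    assert(twice(2.0) == 4.0);
}
```

With a plain `Matrix&` parameter, the calls with `cm` and with the temporary would not compile.
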
Ajay

What you don't see, without profiling the code and looking for differences in performance, is what is happening behind the scenes.

A Matrix a parameter is passed by value. The source Matrix will be copied into Matrix a, and depending on the size of the matrix, this may take a while. A 3x3 matrix isn't much, but if it happens a lot... And if instead of 3x3 you have a 300x300 matrix, that is a lot of vector construction and data copying, which is probably going to be a performance killer. It will almost certainly take longer than the pass-by-reference version of the function, const Matrix& a, which will only copy the address of the source Matrix unless the compiler sees some advantage in performing a copy of the Matrix itself.
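
One way to make that hidden copying visible without a profiler is to count copy operations. This is only a sketch: `Counted`, `CountedMatrix`, and the helper functions are hypothetical names introduced here, and the element type is swapped from `double` to an instrumented struct purely so the copies can be counted.

```cpp
#include <cassert>
#include <vector>

// Hypothetical element type that counts how many times it is copied.
struct Counted {
    static int copies;
    double value;
    Counted() : value(0.0) {}
    Counted(const Counted& other) : value(other.value) { ++copies; }
    Counted& operator=(const Counted& other) {
        value = other.value;
        ++copies;
        return *this;
    }
};
int Counted::copies = 0;

typedef std::vector<std::vector<Counted> > CountedMatrix;

// Pass by value: the whole matrix is copied into m on every call.
double first_by_value(CountedMatrix m) { return m[0][0].value; }

// Pass by const reference: no element is copied.
double first_by_reference(const CountedMatrix& m) { return m[0][0].value; }

// Number of element copies caused by one pass-by-value call.
int copies_when_passed_by_value(const CountedMatrix& m) {
    Counted::copies = 0;
    first_by_value(m);
    return Counted::copies;
}

// Number of element copies caused by one pass-by-reference call.
int copies_when_passed_by_reference(const CountedMatrix& m) {
    Counted::copies = 0;
    first_by_reference(m);
    return Counted::copies;
}
```

For a 300x300 matrix, the by-value call copies all 90000 elements (plus 301 vector allocations), while the by-reference call copies none.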

So you could get by with Matrix & a, but that allows the function to play with the contents of Matrix a, possibly to the detriment of the calling function. Changing the declaration to const Matrix & a promises the caller that a will not be changed inside the function and the compiler backs this up by refusing to compile if you try. This is mostly a safety thing. It prevents the possibility of subtle mistakes introducing hard-to-detect errors.
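
A small sketch of that difference, using the `Matrix` typedef from the question and two made-up functions, `zero_first` and `read_first`:

```cpp
#include <cassert>
#include <vector>

typedef std::vector<std::vector<double> > Matrix;

// Non-const reference: the function can change the caller's matrix.
void zero_first(Matrix& m) {
    m[0][0] = 0.0;
}

// Const reference: read-only access to the caller's matrix.
double read_first(const Matrix& m) {
    // m[0][0] = 0.0;  // would not compile: m refers to a const Matrix here
    return m[0][0];
}

// Hypothetical usage showing the effect on the caller's object.
void demo() {
    Matrix a(2, std::vector<double>(2, 1.0));
    assert(read_first(a) == 1.0);  // a is unchanged by the const version
    zero_first(a);
    assert(a[0][0] == 0.0);        // a was modified through the reference
    assert(a[0][1] == 1.0);
}
```

Uncommenting the assignment inside `read_first` makes the compiler reject the function, which is the safety check described above.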

user4581301