Efficient coding with memory management

Question

I've switched recently from matlab to c++ in order to run simulations faster, however it still runs slow. I'm pretty positive that there is much to improve in terms of memory usage.

Consider the following code, it shows an example of two array/vector declaration, that I use in a simulation.

One with known fixed length (array01) and another with unknown length (array02) that changes during the run.

The question here is what is the best/proper/efficient way of declaring variables ( for both array types) in terms of memory usage and performance.

# include <iostream>
# include <vector>
# include <ctime>
# include <algorithm>
using namespace std;

const int n = 1000;
const int m= 100000;
int main()
{
    srand((unsigned)time(NULL));
    vector <double> array02;
    vector <vector<double>> Array01(n,m);
    for (unsigned int i=0; i<n; i++)
    {
        for (unsigned int j=0; j<m;j++)
        {
             array02.clear();
             rr = rand() % 10;
             for (unsigned int l = 0 ; l<rr <l++)
             {
                  array02.pushback(l);
             }
             // perform some calculation with array01 and array02
         }           
     }
}

reserve() makes sense when you have reasonably precise estimate of the total size you'll need .in this case, i think you know the size.so, vector.reserve() should help in speeding up things. — basav, Sep 30 '15 at 07:04
Profile your code before making any speculation on what needs to be optimized. By the way, static arrays will be good enough. — , Sep 30 '15 at 07:06
@basav, do you reffer to array02? what about the case when the vector size becomes smaller during the run? — jarhead, Sep 30 '15 at 07:09
@YvesDaoust, could you please elaborate more and show how you would change the code? — jarhead, Sep 30 '15 at 07:10
@YvesDaoust, sorry but what do you mean by profiling the code? — jarhead, Sep 30 '15 at 07:13
https://en.wikipedia.org/wiki/Profiling_(computer_programming). Your development environment should have a built-in profiler. This is a must-have tool when facing performance problems. — , Sep 30 '15 at 07:16
@jarhead, profilers: http://stackoverflow.com/questions/67554/whats-the-best-free-c-profiler-for-windows . But your code is not going to be fast because it makes half a billion operations on average. You should rather look into a possibility to reduce the algorithmic complexity of your code. Also, `rand()` function is quite slow and not random enough, see http://stackoverflow.com/a/31091422/1915854 — Serge Rogatch, Sep 30 '15 at 07:18
2D arrays and vectors are a performance swallowing sucker bet. because the storage backing the 2D arrays can be anywhere in RAM and nowhere near each other (read up on Spacial Locality) you can spend more time loading blocks of the matrix into cache than you do processing. With a 1D array everything is in one sequential block and, assuming you orient the array to best fit the access pattern of your data, may have already been loaded when you need it by a previous read. It is also predictable a hell, and the compiler can take advantage when optimizing. — user4581301, Sep 30 '15 at 08:08
if you really want speed and reliability, then , just go with performance libraries. intel-ipp is especially helpful for scientific computing on intel platform — basav, Sep 30 '15 at 08:47

Basile Starynkevitch · Answer 1 · 2015-09-30T07:46:33.547

You should consider defining your own Matrix class with a void resize(unsigned width, unsigned height) member function, and a double get(unsigned i, unsigned j) inlined member function and/or a double& at(unsigned i, unsigned j) inlined member function (both giving M_i,j element). The matrix internal data could be a one-dimensional array or vector of doubles. Using a vector of vectors (all of the same size) is not the best (or fastest) way to represent a matrix.

class Matrix {
   std::vector<double> data;
   unsigned width, height;
public:
   Matrix() : data(), width(0), height(0) {};
   ~Matrix() = default;
   /// etc..., see rule of five
   void resize(unsigned w, unsigned h) {
      data.resize(w*h);
      width = w; height = h;
   }
   double get(unsigned i, unsigned j) const { 
      assert(i<width && j<height);
      return data[i*width+j];
   }
   double& at(unsigned i, unsigned j)  { 
      assert(i<width && j<height);
      return data[i*width+j];
   }
}; // end class Matrix

Read also about the rule of five.

You could also try scilab (it is free software). It is similar to Matlab and might have different performances. Don't forget to use a recent version.

BTW, there are tons of existing C++ numerical libraries dealing with matrices. Consider using one of them. If performance is of paramount importance, don't forget to ask your compiler to optimize your code after you have debugged it.

Assuming you are on Linux (which I recommend for numerical computations; it is significant that most supercomputers run Linux), compile using g++ -std=c++11 -Wall -Wextra -g during the debugging phase, then use g++ -std=c++11 -Wall -Wextra -mtune=native -O3 during benchmarking. Don't forget to profile, and remember that premature optimization is evil (you first need to make your program correct).

You might even spend weeks, or months and perhaps many years, of work to use techniques like OpenMP, OpenCL, MPI, pthreads or std::thread for parallelization (which is a difficult subject you'll need years to master).

If your matrix is big, and/or have additional properties (is sparse, triangular, symmetric, etc...) there are many mathematical and computer science knowledge to master to improve the performance. You can make a PhD on that, and spend your entire life on the subject. So go to your University library to read some books on numerical analysis and linear algebra.

For random numbers C++11 gives you <random>; BTW use C++11 or C++14, not some earlier version of C++.

Read also http://floating-point-gui.de/ and a good book about C++ programming.

^{PS. I don't claim any particular expertise on numerical computation. I prefer much symbolic computation.}

@BasileStarynkevitch, I work with visual studio express 2010, with windows 7. I'm not familiar with some of the terms you have mentioned, Would you mind writing an exampe of what you explained? — jarhead, Sep 30 '15 at 07:03
No, I don't have time for that (it could mean writing many pages). This means teaching you many things. Go to your university library, there are tons of books on the subject, on which I have no particular expertise. — Basile Starynkevitch, Sep 30 '15 at 07:06

Lukáš Bednařík · Answer 2 · 2015-09-30T07:51:09.093

First of all don't try to reinvent the wheel :) Try to use some heavily optimized numerical library, for example

Intel MKL (Fastest and most used math library for Intel and compatible processors)
LAPACK++ (library for high performance linear algebra)
Boost (not only numerical, but solves almost any problem)

Second: If you need a matrix for a very simple program, use vector[i + width * j] notation. It's faster because you save an extra memory allocation.

Your example doesn't event compile. I tried to rewrite it a little:

#include <vector>
#include <ctime>

int main()
{
    const int rowCount = 1000;
    const int columnCount = 1000;

    srand(time(nullptr));

    // Declare matrix
    std::vector<double> matrix;

    // Preallocate elemts (faster insertion later)
    matrix.reserve(rowCount * columnCount);

    // Insert elements
    for (size_t i = 0; i < rowCount * columnCount; ++i) {
        matrix.push_back(rand() % 10);
    }

    // perform some calculation with matrix
    // For example this is a matrix element at matrix[1, 3]:
    double element_1_3 = matrix[3 + 1 * rowCount];

    return EXIT_SUCCESS;
}

Now the speed depends on rand() (which is slow).

score 1 · Answer 3 · answered Sep 30 '15 at 07:47

As people said:

Prefer a 1d array instead of 2d array for matrices.
Don't reinvent the wheel, use existing library: I think that Eigen library is the best suite for you, judging from your code. It also have very, very optimized code generated since it use C++ template static calculation when ever possible.

Efficient coding with memory management

3 Answers3