STL containers speed vs. arrays

Question

I just started working on a scientific project where speed really matters (HPC). I'm currently designing the data structes. The core of the project is a 3D-Grid of double values, in order to solve a partial differenital equation.

Since speed here is a probably bigger concern then simplicity of the code, I'd like to know how the STL performs compared to usual C-style arrays. In my case, since it's a 3D-grid, I was thinking of a) a one dimensional vector with linear indexing b) a vector of 3 vectors or c) a one dimensional c-style array or d) a three dimensional c-style array.

I looked up older questions, but I only found questions concering construction/destruction (which is not important here, as the datastructures are only created once at program start - fast indexing and computations on it are important) or comparison of different STL containers.

Thanks for help

You should take a look at [`boost::multi_array`](http://www.boost.org/doc/libs/1_53_0/libs/multi_array/doc/user.html) - it's quite likely to be very good for both performance and design in this case. — Joseph Mansfield, Feb 20 '13 at 19:48
If you want to do PDE's have a look at Numerical Recipes, http://www.nr.com/, this has been updated to C++ — QuentinUK, Feb 20 '13 at 19:54
A vector of vectors is probably the worst choice (you'll get fragmented memory). What you probably want is a basically [this](http://stackoverflow.com/a/6465254/179910) extended to 3 dimensions. In actual use, this will typically be the same speed as an array. — Jerry Coffin, Feb 20 '13 at 19:56
HPC often goes with MPI and MPI does not like to follow pointers, hence you would most probably need flat memory layout for your arrays. This rules out structures like `vector>` — Hristo Iliev, Feb 22 '13 at 13:13

score 7 · Accepted Answer · answered Feb 20 '13 at 20:00

It's hard (or even impossible) to say in advance what the relative performances will be. Generally, on modern machines, using a flat vector, and calculating the indexes into it, will outperform a vector of vector of vector; on older machines, the reverse was true. (On modern machines, multiplication is usually cheaper than the page misses due to poor locality. On older machines, multiplication was expensive, and there weren't caches, so locality didn't make a difference.)

Also, depending on the machine and the library implementation, using an std::vector for this flat vector may be more expensive than using a simple pointer to the memory (or it may not be—you can't know in advance). I'd still go for the vector first, wrap everything carefully in a controlling class, and if there were still performance problems, switch to the pointer.

score 4 · Answer 2 · answered Feb 20 '13 at 19:49

4

The memory layout of the 1D and 3D arrays is going to be the same, and similar to the memory layout of a linear std::vector. I would take that approach: a unidimensional vector and an inlined function to map to the appropriate location.

The vector of vectors, on the other hand has a completely different layout, as the data in the vector is not stored inside the object but dynamically allocated and referenced. This means that two different rows in a std::vector<std::vector<int>> might be in completely unrelated locations in memory.

answered Feb 20 '13 at 19:49

David Rodríguez - dribeas

204,818
23
294
489

it's worth saying, the for your three dimensions, (row, column, 'sheet'), the inner dimension will be stored as one vector, and will have equally good access times, and this may be ok for. Access to this on vector will be rather good, but to others you will have a performance hit over a 1D structure. you will also need to use `std::vector>>` to get your three dimensions – thecoshman Feb 20 '13 at 19:57
2

@thecoshman You also run the risk that the structure becomes ragged with the nested vectors. Using a flat vector and calculating the index is probably safest. Then, once everything is working, you can change the internals of the class to find out which solution works best. – James Kanze Feb 20 '13 at 20:02
indeed, but no uniform may be something that is desired. Still, I do think that wrapping this up in a interface would not be a bad idea, at least during testing. Would make it very easy to swap the implementation details to see what works best. Then, if you really wanted to, you could strip out that interface. – thecoshman Feb 20 '13 at 20:03

bames53 · Answer 3 · 2013-02-20T20:02:53.130

2

A vector will do internally what you need to do manually in order to handle an individually sized raw array, therefore as long as they are used properly the vectors should perform the same as raw arrays performing the same task.

For example the single vector should perform the same as the single dimensional c-array, and the vector of vectors should perform approximately the same as if you used a raw array of pointers, each element of which points into an array.

However if the 3d array has uniform dimension sizes (i.e., not ragged arrays) then vectors pay an extra cost to individually manage their sizes (e.g., they have to store their sizes individually). If any of the dimension sizes are known at compile time then you'd be better off using the 'STL' container 'std::array, because it won't have that unnecessary overhead. You can even have multi-dimentionalstd::arrays`:

template<typename T, int Planes, int Rows, Cols>
using Matrix = std::array<std::array<std::array<T,Cols>,Rows>,Planes>;

Although not strictly required this should be the same as T arr[Planes][Rows][Cols];, but without the problems of raw c-arrays.

edited Feb 20 '13 at 20:02

answered Feb 20 '13 at 19:50

bames53

86,085
15
179
244

you only have a cost to pay if the vectors are resizing, do not change there size, no cost. – thecoshman Feb 20 '13 at 20:00
@thecoshman the cost I'm referring to is the memory for storing the length of each vector individually. – bames53 Feb 20 '13 at 20:03
oh pish posh, that's a few bytes. You need to get an order of magnitude more dimensions before that starts to become a concern. – thecoshman Feb 20 '13 at 20:04
correct me if I am wrong, but isn't a matrix simple 2D... not sure what the proper term for a 3D structure like this is. – thecoshman Feb 20 '13 at 20:06
@thecoshman In mathematics 'matrix' typically does mean 2d, but other usages allow higher dimensions. In mathematics I think higher dimensional matrices are called 'tensors'. – bames53 Feb 20 '13 at 20:13
ah yes, tensors brings back memories of painfully tedious maths lessons. Have an upboat good sir – thecoshman Feb 20 '13 at 22:27

score 0 · Answer 4 · answered Feb 22 '13 at 13:46

A widely used in HPC dynamically allocated static (in terms of unchangeable dimensions after allocation) array object is the combination of a flat array and a dope vector. Basically the idea is to allocate a big flat chunk of memory and then to build a tree of pointers into it. For 2D arrays the tree is just a simple linear list of pointers to the beginning of each row. For 3D arrays the tree has two levels and each of the second level elements points to a row from the 2D slice that corresponds to the first level. Placing the dope vector tree at the beginning of the allocated memory allows you to directly apply the [] indexing operator, e.g. A[i][j][k], but some care has to be taken as &A[i] would not be the beginning of the i-th 2D slice.

One problem of this approach is that the dope vector tree could grow very large in size. For example, the size of this data structure for a 1000x1000x1000 array on a 64-bit machine is 1000x1000x8 bytes or almost 8 MiB. This could take precious cache space for certain data access patterns.

STL containers speed vs. arrays

4 Answers4

Linked