
A discussion along this line can be found in this question and also here, but my case is slightly different, as I am dealing with dynamically allocated memory.

Also, please note that memset does not quite work with double values.

Anyway, I am trying to use std::fill to fill a dynamically allocated 2D array --

#include <iostream>
#include <algorithm>

using std::cout ; using std::endl ;
using std::fill ;

int main()
{
    double **data ;
    int row = 10, col = 10 ;
    data = new double*[row] ;
    for(int i = 0 ; i < row ; i++)
        data[i] = new double[col];

    // fill(&(data[0][0]), 
    //      &(data[0][0]) + (row * col * sizeof(double)), 1); // approach 1
    // fill(&(data[0][0]), &(data[0][0]) + (row * col), 1);   // approach 2
    // fill(data, data + (row * col * sizeof(double)), 1);    // approach 3
    // fill(&(data[0][0]), 
    //      &(data[0][0]) + 
    //       ((row * sizeof(double*)) + 
    //        (col * sizeof(double))), 1);                    // approach 4

    for(int i = 0 ; i < row ; i++) {
        for(int j = 0 ; j < col ; j++)
            cout << data[i][j] << " " ;
        cout << endl ;
    }

    for(int i = 0 ; i < row ; i++)
        delete [] data[i] ;
    delete [] data ;
}

Approach 1: As I understand it, approach 1 should be the correct code: I start from the beginning, &(data[0][0]), and the end of the array should be located at &(data[0][0]) + (row * col * sizeof(double)). But when I run it, I get this error, even though the array has been filled --

1 1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 1 
*** Error in `./test': free(): invalid next size (fast): 0x0000000000da3070 ***
Aborted (core dumped)

Approach 2: However, according to this post, approach 2 is recommended, but I do not quite understand this code, since sizeof(double) is missing, and I am getting this output --

1 1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 1 
1 1 1 1 1 1 1 1 1 1 
1 1 1 1 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 
*** Error in `./test': free(): invalid next size (fast): 0x0000000000bf5070 ***
Aborted (core dumped)

Approach 3: I am not sure why this does not compile; data and &(data[0][0]) should have the same meaning, right?

Approach 4: I am not sure if this is correct or not.

  1. How do I do it ?
  2. Does std::fill give any extra benefit over two nested loops?

3 Answers


Unlike a stack-allocated 2D array, a dynamic 2D array is not guaranteed to be a contiguous range. It is, however, a contiguous range of pointers, but each pointer in the array may point to a non-contiguous memory area. In other words, the first element of the array pointed to by data + i + 1 does not necessarily follow the last element of the array pointed to by data + i. If you wonder why a stack-allocated 2D array is contiguous, it is because when you declare something like

double data[10][20];

then the compiler understands it as an array of 10 (contiguous) elements, each of type double[20]. The latter type is also an array, which guarantees contiguous elements, so the double[20] elements (i.e., 20 doubles one after the other) are laid out one after the other in memory. double[10][20] is strikingly different from double**.

That's why std::fill or std::memset gives you headaches, as they both assume a contiguous range. Therefore in your case a nested loop seems to be the simplest way of filling the array.

In general, it is much better to use a 1D array in which you "mimic" the 2D access, exactly for the reason mentioned above: data locality. Data locality implies fewer cache misses and better overall performance.


Pointer arithmetic requires that a pointer be incremented only to the extent that the result still points at the same array (or one past the end).

You allocate each row as a separate array in your for loop:

for(int i = 0 ; i < row ; i++)
    data[i] = new double[col]; // this creates a distinct array for each row

Since each row array you allocate is col elements, the maximum value that can legally be added to &(data[0][0]) is col. But in each of your examples of std::fill usage you add more to the pointer than you are allowed.

Given the way you are allocating the array, there's no way for you to pass raw pointers to a single call to std::fill in order to initialize the entire 2D array. Either you must use more than one call to std::fill (which defeats the purpose of using std::fill), or you must create an Iterator type that knows how to deal with the separate allocation of each row, or you must change the way you allocate the array.

I would recommend allocating the whole array at once as a single dimensional array and then writing some additional code to make it work like a two dimensional array. This has a number of benefits:

  • The standard library contains a convenient way of dynamically allocating a single-dimensional array: std::vector
  • Using std::vector means you no longer need to use naked new and delete, which fixes the exception safety problem your code has.
  • A single allocation generally has better performance characteristics than many allocations (Of course, there are cases where separate allocations are better).

Here's a simple wrapper class to make a 1D array look like a 2D array:

class Matrix {
  std::vector<double> data;
  int cols;

public:
  Matrix(int row, int col)
    : data(row * col)
    , cols(col)
  {}

  auto begin() { return data.begin(); }
  auto end() { return data.end(); }

  struct index_type { int row; int col; };

  double &operator[](index_type i) {
    return data[i.row * cols + i.col];
  }

  int row_count() const { return data.size()/cols; }
  int column_count() const { return cols; }
};

Using this you can rewrite your code:

#include "Matrix.h"

#include <iostream>
#include <algorithm>

using std::cout ; using std::endl ;
using std::fill ;

int main()
{
    Matrix data(10, 10);

    fill(data.begin(), data.end(), 1.0);

    for(int i = 0 ; i < data.row_count() ; i++) {
        for(int j = 0 ; j < data.column_count() ; j++)
            cout << data[{i, j}] << " " ;
        cout << endl ;
    }
}

Does std::fill give any extra benefit over two nested loops?

Using loops is less readable because a loop can do lots of different things, so you have to spend more time figuring out what any particular loop is doing. For this reason, one should prefer STL algorithms over manual loops, all else being equal.


// fill(&(data[0][0]), 
//      &(data[0][0]) + (row * col * sizeof(double)), 1); // approach 1

Pointer arithmetic automatically accounts for the size of the array elements, so you don't need sizeof(double). Multiplying by sizeof(double) here is the same as multiplying by sizeof(double) inside []. You wouldn't write data[i * sizeof(double)], so don't write data + (i * sizeof(double)).


Your example code uses &(data[0][0]). Think about whether this is the same as or different from data[0]. Consider both the type and the value of the expressions.

  • Upvoted. I'd also write the `const` version of operator[]. Any particular reason for using a custom indexing type instead of a std::pair or two size_t parameters? – vsoftco Dec 27 '15 at 01:36
  • @vsoftco I took the `const` members out simply because they don't happen to be needed here and I wanted to keep the code simple. But, yes, in real code they should be there (as well as a bunch of other stuff). – bames53 Dec 27 '15 at 01:51
  • @vsoftco I used a custom index type both for the custom member names (as opposed to `.first`, `.second`) and also because this way static typing can prevent things like mixing up `Matrix::index_type` with `SomethingElse::index_type`. Sometimes overusing generic types like `std::pair` can cause "stringly typed" issues. Using two index parameters would require me to move away from the `[]` overload, which is fine, I just happen to prefer not to do that. – bames53 Dec 27 '15 at 01:53
  • Fair enough, thanks! I still believe someone should propose a basic matrix class for the Standard. C++ is probably the only major language that lacks one. There is Boost multiarray, of course, but it still hasn't made its way into the Standard. – vsoftco Dec 27 '15 at 01:54
  • @vsoftco I think the [array_view proposal](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4512.html) covers most of what's needed. – bames53 Dec 27 '15 at 02:13

I agree with the above comments. You have allocated 10 separate arrays, so you can't initialize them with a single std::fill call. Moreover, when you perform arithmetic on a pointer of non-void type, the compiler automatically multiplies the offset by the sizeof of that type. However, when you use functions like memset or memcpy, you have to multiply the number of elements by the sizeof of the type yourself and pass that byte count to the function. That's because these functions operate on bytes and accept pointers of void type, so it is impossible for the compiler to adjust the sizes for you: the void type has no specified size.