4

I am currently porting some C code I wrote to C++ for fun. I am struggling with a malloc() call I make in C, with h and w being constants for simplicity reasons, but later exchanged with runtime constants:

double (*g2)[h][w] = malloc(h * w * sizeof(double));

In C, this is an implicit conversion of a void*, and this of course doesn't fly with C++.

I already tried casting this with reinterpret_cast<double[h][w]>, but this is still an invalid cast.

I was wondering, how can I make this work in C++ since this would save me a lot of work?

As an alternative I'll probably use a matrix class with indirection:

struct Matrix : std::vector<double> {
    unsigned matSize;
    std::vector<double*> indirection;
    Matrix() : matSize(0) {}
    Matrix(unsigned n) : matSize(n) {
        resize(n*n);
        indirection.resize(n);
        for(unsigned i = 0; i < n; ++i) {
            indirection[i] = &(*this)[i*n];
        }
    }
    double& operator()(unsigned i, unsigned j) {
        return indirection[i][j];
    }
    const double& operator()(unsigned i, unsigned j) const {
        return indirection[i][j];
    }
};
Philipp-P
  • 45
  • 8
  • You need to cast it to the same type as `g2`, i.e., a **pointer** to an array of arrays. – Pete Becker Sep 26 '19 at 17:44
  • 4
    You'd need `std::vector>`. It is way better than a pointer to two-dimensional array. – SergeyA Sep 26 '19 at 17:46
  • 5
    Don't use a `std::vector>`. Write a matrix class that uses `std::vector` for the storage and fake that it has multiple dimensions. – NathanOliver Sep 26 '19 at 17:47
  • @PeteBecker however, such cast would be a welcome sign on the undefined territory (when result of such a cast is used), as OP would be accessing an object which lifetime hasn't officially begin. – SergeyA Sep 26 '19 at 17:48
  • @NathanOliver Since OP used 2-dimensional array in the original C code, I assumed there was a reason for this. The same packing technique could be applied to C, but since it wasn't, I assumed OP doesn't want it. – SergeyA Sep 26 '19 at 17:51
  • *I already tried casting this with reinterpret_cast* -- Rewrite whatever that routine is doing in C++, so that casts like that need never be done. If it means using a `std::vector` or a full-blown C++ matrix class, so be it. – PaulMcKenzie Sep 26 '19 at 17:51
  • 1
    @SergeyA A 2d array is very different from a 2d vector or a double pointer. With a 2d array all of the rows are sequential in memory. The others, not so much. – NathanOliver Sep 26 '19 at 17:52
  • @NathanOliver that totally depends on the usage. vectors of vectors are very easy to use if per-vector access is required (i.e. you have a routine which wants a vector). Perhaps you only access individual vector rows, and locality of different vectors is not the issue. I do not think general advise is applicable here. – SergeyA Sep 26 '19 at 17:55
  • I really don't want to use vectors here since this is a code intended for high performance I wan't to use instead a contiguous memory block. Also of course I want to reuses as much code that I can from my original since this is a quite big heat dissipation simulation code – Philipp-P Sep 26 '19 at 17:57
  • @Philipp-P this totally validates what Nathan was saying, but it would be better if you indicated so in the original question. However, if `h` and `w` are constants, what's the point of the pointer to the array in the first place? – SergeyA Sep 26 '19 at 17:59
  • 5
    *I really don't want to use vectors here since this is a code intended for high performance* -- Don't fall for this misconception. – PaulMcKenzie Sep 26 '19 at 17:59
  • To better answer the question, we need to know the rationale for having a pointer to the array in the first place. – SergeyA Sep 26 '19 at 18:02
  • 1
    Agreed. ```std::vector``` are, under the hood, arrays. Their performance is quite good overall. Re-sizing is fairly quick as well, since they using a 1.5x increasing scheme if using ```push_back```, and ```reserve()``` or ```resize()``` is also an option. – Tyler Shellberg Sep 26 '19 at 18:03
  • @Philipp-P -- If you are initializing that memory space to a value, then there is fundamentally no difference between what `vector` does and what the `malloc` is doing. The only thing is that vector chooses to always initialize the memory it has allocated, and unless the initialization is a bottleneck, there should be little to no difference between `vector` and `malloc`. – PaulMcKenzie Sep 26 '19 at 18:16
  • @SergeyA -- sorry, I don't see the connection between lifetime and the cast. The memory is used as an array of a builtin type, so there are no constructors to run. The problem I addressed was simply that the cast was going to the wrong type. Please don't involve me in your suggestions about preferences for alternatives. – Pete Becker Sep 26 '19 at 18:21
  • @PaulMcKenzie I agree on that in general, but actually, there was a reason why I used ```malloc``` and not ```calloc```. Although this makes surly not a lot of difference I would prefer to not initialise the memory to stick as close as possible to my C version. – Philipp-P Sep 26 '19 at 18:21
  • If you're really freaked out by `std::vector`, don't forget about `std::array` which isn't resizable. – tadman Sep 26 '19 at 18:33
  • @PeteBecker it doesn't matter if there is a need to run a constructor or not. Using a pointer (or reference) of type A to access the object of type B is undefined behavior (except for the several cases, of which none applies here). You need to create an object of type 'array' before you access it, and this is not happening in this code. – SergeyA Sep 26 '19 at 18:35
  • @Philipp-P -- Looking at your `malloc`, this seems to be a one-time creation of the data. Is this being called continuously in a loop? – PaulMcKenzie Sep 26 '19 at 18:43
  • I'll probably will use a matrix class I wrote some time ago consisting of contiguous memory with indirection. I just wanted to save me the hassle of adapting 500+ lines of code to it. – Philipp-P Sep 26 '19 at 18:49
  • @PaulMcKenzie of course this is a statement that is only called once – Philipp-P Sep 26 '19 at 18:50
  • @Philipp-P Ok, so this is nothing more than a syntax issue? Like some comments and answers mentioned, porting is much more than doing a line-by-line translation. How something is accomplished in C can and will be different than how something should be done in C++. Also, C and C++ are two different languages, but you're treating them as the same, but with minor syntax differences. This is not the approach to take (IMO). – PaulMcKenzie Sep 26 '19 at 18:57
  • @Philipp-P BTW, your latest edit with the `Matrix` class reminds of the old saying "premature optimization is the root of all evil". It also isn't wise to inherit from `std::vector`. – PaulMcKenzie Sep 26 '19 at 19:05
  • @PeteBecker if you are calling strict aliasing rule a **nonsense**, I think we are done with this conversation. If you want to educate yourself, you can look at comments on (now deleted) answer here, where Nathan Oliver explains the rule. – SergeyA Sep 26 '19 at 19:12
  • @PaulMcKenzie I kinda have to disagree on the first part, since this was the result of performance testing for array classes in a graduate course I took in university. In general I know what you mean and yes I have been guilty of that crime multiple time :P Would you care to elaborate on the inheritance ```std::vector```, though please? I'm not that familiar with C++, so I always can use some input there – Philipp-P Sep 26 '19 at 20:14
  • @Philipp-P It's not recommended to inherit from the standard library classes because they lack a virtual destructor so you can't delete your objects through a base class pointer - or you can, but then only the vector's destructor would get called and all kinds of problems would arise. Since creation performance is important, compare your Matrix with the non-initializing version I suggested (you can use http://quick-bench.com/). My initial tests shows the non-initializing version being ~10000 faster (for 1024x1024). I haven't tested accessing the elements in the array yet though. – Ted Lyngmo Sep 26 '19 at 20:50
  • @TedLyngmo the non initialisation is pretty nice in your version! Regarding the access time, the matrix with indirection beats access wise your version, since you use a multiplication and I precompute the pointers. I did test this intensely some years ago, so maybe with new compilers this actually changed. Well to be honest I don't care for the longer creation time that much (and additional memory), since this is only called once and the computations itself afterwards runs for over an hour and therefore I prefer faster access. Thanks for the input on the inheritance from the standard library! – Philipp-P Sep 26 '19 at 21:09
  • 1
    @Philipp-P I added note at the bottom of my answer with a link to a comparison. – Ted Lyngmo Sep 26 '19 at 21:38
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/200042/discussion-on-question-by-philipp-p-porting-c-code-to-c-problem-with-casting). – Samuel Liew Sep 27 '19 at 03:59

4 Answers4

9

Porting involves more than just making it work, line by line, so:

C:

double (*g2)[h][w] = malloc(h * w * sizeof(double));
...
g2[y][x] = ...;

C++:

std::vector<double> g2(h*w);
...
g2[y+x*h] = ...; // or
g2[y*w+x] = ...;

Using that syntax is inconvenient for accessing elements so you might want to wrap it inside a simple class. Example:

#include <iostream>
#include <iterator>
#include <vector>

class arr2d {
public:
    arr2d(size_t h, size_t w) : data_(h * w), w_(w) {}

    inline double& operator()(size_t y, size_t x) { 
        return data_[y * w_ + x];
    }
    inline double operator()(size_t y, size_t x) const {
        return data_[y * w_ + x];
    }

    // getting pointer to a row
    inline double* operator[](size_t y) {
        return &data_[y * w_];
    }
    inline double const* operator[](size_t y) const {
        return &data_[y * w_];
    }

    inline size_t width() const { return w_; }

private:
    std::vector<double> data_;
    size_t w_;
};

int main() {
    arr2d g2(3, 4);

    g2(2, 3) = 3.14159;
    // alternative access:
    g2[1][2] = 1.23456;

    std::cout << g2[2][3] << "\n";

    double* row = g2[2];
    std::copy(row, row + g2.width(), std::ostream_iterator<double>(std::cout, ", "));
    std::cout << "\n";
}

Output:

3.14159
0, 0, 0, 3.14159,

A non-initializing version could look like:

class arr2d {
public:
    arr2d(size_t h, size_t w) : data_(new double[w * h]), w_(w) {}

    inline double& operator()(size_t y, size_t x) { return data_[y * w_ + x]; }
    inline double operator()(size_t y, size_t x) const { return data_[y * w_ + x]; }

    inline double* operator[](size_t y) { return &data_[y * w_]; }
    inline double const* operator[](size_t y) const { return &data_[y * w_]; }

    inline size_t width() const { return w_; }

private:
    std::unique_ptr<double[]> data_;
    size_t w_;
};

But note that the
std::copy(row, row + g2.width(), std::ostream_iterator<double>(std::cout, ", "));
from the first example would lead to undefined behaviour.

Also note that this version will delete the copy constructor and copy assignment operator. You'll have to implement them yourself if you need them.

The creation time for the non-initializing version is of course hard to beat with any initializing version, but for access times, one might think that a lookup table, or indirection as you call it, for the rows would speed up things compared to doing the multiplication and addition in one go.

My results:
8x8 http://quick-bench.com/f8zcnU9P8oKwMUwLRXYKZnLtcLM
1024x1024 http://quick-bench.com/0B2rQeUkl-WoqGeG-iS1hdP4ah8
4096x4096 http://quick-bench.com/c_pGFmB2C9_B3r3aRl7cDK6BlxU

It seems to vary. The lookup version is faster for the 4096x4096 matrix but the naive version is faster for the two smaller ones. You need to compare using sizes close to what you'll be using and also check with different compilers. I sometimes get completely opposite "winners" when changing compiler.

Since you don't mind inheriting from std::vector or keeping extra data for a lookup-table, this could be an option. It seems to outperform the other versions slightly.

class arr2d : protected std::vector<double*> {
public:
    using std::vector<double*>::operator[]; // "row" accessor from base class

    arr2d(size_t h, size_t w) :
        std::vector<double*>(h), 
        data_(new double[w * h]), 
        w_(w),
        h_(h)
    {
        for(size_t y = 0; y < h; ++y)
            (*this)[y] = &data_[y * w];
    }

    inline size_t width() const { return w_; }
    inline size_t height() const { return h_; }

private:
    std::unique_ptr<double[]> data_;
    size_t w_, h_;
};

Here are Philipp-P's (OP:s) own measurements for the different 2D-array implementations:

8x8 http://quick-bench.com/vMS6a9F_KrUf97acWltjV5CFhLY
1024x1024 http://quick-bench.com/A8a2UKyHaiGMCrf3uranwOCwmkA
4096x4096 http://quick-bench.com/XmYQc0kAUWU23V3Go0Lucioi_Rg

Results for 5-point stencil code for the same versions:
8x8 http://quick-bench.com/in_ZQTbbhur0I4mu-NIquT4c0ew
1024x1024 http://quick-bench.com/tULLumHZeCmC0HUSfED2K4nEGG8
4096x4096 http://quick-bench.com/_MRNRZ03Favx91-5IXnxGNpRNwQ

Ted Lyngmo
  • 93,841
  • 5
  • 60
  • 108
  • I really don't want to use vectors here since this is a code intended for high performance I wan't to use instead a contiguous memory block. Also of course I want to reuses as much code that I can from my original since this is a quite big heat dissipation simulation code. – Philipp-P Sep 26 '19 at 17:56
  • 3
    @Philipp-P It is a contiguous memory block and I doubt the performance will be different from your C version of `g2`. It does the same lookup. – Ted Lyngmo Sep 26 '19 at 17:57
  • @TedLyngmo yes you are right, but it should initialise the memory if I'm not mistaken and I want to avoid that to stick as close as possible as I can to my original C version. – Philipp-P Sep 26 '19 at 18:23
  • 3
    @Philipp-P Ok, added example for that. – Ted Lyngmo Sep 26 '19 at 18:31
  • Thanks for adding the benchmarks, but I'm pretty sure although your comment states otherwise you take the initialisation into account. Please correct me if I'm wrong. Also I created this benchmark quick for `1024x1024` showing the strength of the indirection when doing a read and write: http://quick-bench.com/p5RxS77re84EmF8An_-0SuI1ptE – Philipp-P Sep 26 '19 at 22:20
  • @Philipp-P No, the initialization is outside of the benchmarked `state` loop so that's not part of it. Only things inside the loop are benchmarked. Interesting graph :-) I changed it so the same accessors are used: http://quick-bench.com/4ufgYSViJ2F3Y4U68jRlbxtnfEQ – Ted Lyngmo Sep 26 '19 at 22:30
  • Why do you think the initialization is counted? You shouldn't mix the different versions of my accessors to get a meaningful resullt. stick with the `operator[]` accessor. – Ted Lyngmo Sep 26 '19 at 22:36
  • I thought so, because of the highlighting the tool does. I never used this before – Philipp-P Sep 26 '19 at 22:37
  • 1
    Well, it is not. It's only the things in the `state` loop that is counted. – Ted Lyngmo Sep 26 '19 at 22:37
  • Sorry about the accessors in the non alt access version that was a mishap, it is getting late in my timezone. Also I assume a non compile time constant would be an interesting benchmark, to take a look at – Philipp-P Sep 26 '19 at 22:45
  • @Philipp-P :-) No worries - it's past bedtime here too. I just added my last version before bed. It looks wicked and I haven't tested it for small sizes but for 1024x1024 it does good. – Ted Lyngmo Sep 26 '19 at 22:49
  • that version looks indeed very interesting! – Philipp-P Sep 26 '19 at 22:52
  • 1
    @Philipp-P It's basically your version but with the lookup table as a base class to be able to use the built in `operator[]` from `vector`. No idea why that should matter :-) – Ted Lyngmo Sep 26 '19 at 22:54
  • Completely beats me as well. I guess we have to praise people in the compiler construction field here ;) – Philipp-P Sep 26 '19 at 23:10
  • 1
    I added some further benchmarks to your answer I hope I got them right this time :) – Philipp-P Sep 27 '19 at 00:09
3

In C++ it's not advised to manually allocate memory unless necessary. Let the standard library and templates do the work for you.

They can be very useful and are great to learn if you want to get into C++! You can save a lot of time this way and write some better code.

For example, what is this data type used for? If it fits for your usage, you may instead consider creating a 2D array using std::array:

std::array<std::array<double, w>, h>

If you need to re-size the arrays regularly, std::vector could be used instead. It has effectively the same performance as an array, given that is is all it is under the hood. You can reserve() or resize() as necessary and push_back uses a 1.5x increasing scheme and is good at its job.

EDIT: Since size is known, arrays may be better here. Taken suggestion from comments.

Tyler Shellberg
  • 1,086
  • 11
  • 28
  • 2
    2d vectors are almost never what you really want. The lack memory locality for the rows of the vector so you can suffer a major performance penalty. – NathanOliver Sep 26 '19 at 17:53
  • @NathanOliver Thanks! I am fairly new to C++ myself. What would you recommend instead? ```std::array```? – Tyler Shellberg Sep 26 '19 at 17:54
  • `std::array` is a good choice if you know the size at compile time. If not then write a matrix class that uses 1d `std::vector` for the storage and fake that it has multiple dimensions using math. – NathanOliver Sep 26 '19 at 17:55
3

You can do this, which should work in both C and C++:

double *g2 = (double*) malloc(h * w * sizeof(double));

Though, as others have stated, this is not the way to approach this issue in C++ in general. For instance, you should use a std::vector instead:

#include <vector>
std::vector<double> g2(h * w);

In both cases, you end up with a dynamically allocated 2D array of doubles in a single contiguous block of memory. And as such, you need to then use g2[(row*w)+col] syntax to access the individual elements, where 0 <= row < h and 0 <= col < w.

Remy Lebeau
  • 555,201
  • 31
  • 458
  • 770
  • 1
    This isn't legal C++ per https://timsong-cpp.github.io/cppwp/intro.object#1.sentence-2 – NathanOliver Sep 26 '19 at 18:11
  • 1
    @NathanOliver [Existence of objects created in C functions](https://stackoverflow.com/q/46909105/65863) – Remy Lebeau Sep 26 '19 at 18:39
  • @tadman in C, no cast is needed, but it does work. But in C++, a cast is needed. `malloc()` returns a `void*`, and C++ does not allow a `void*` to be assigned to any pointer type other than a `void*`, so an explicit cast is required. – Remy Lebeau Sep 26 '19 at 18:40
  • Ah, I see what you're saying if you're going for both C and C++. If you're going to assert that then in C++ you'd use `static_cast`. A middle-of-the-road approach seems broken in both languages in different ways. – tadman Sep 26 '19 at 18:42
  • @RemyLebeau The only answer there that cites the standard missed the object requirement. I've left a comment. – NathanOliver Sep 26 '19 at 18:47
  • 2
    @RemyLebeau We have a paper to try and make it legal but as is it is not: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0593r4.html – NathanOliver Sep 26 '19 at 19:33
3

I recommend you using std::vector. Just wrap it if you want 2D array syntax.

#include <iostream>
#include <vector>

class Array2D {
    std::vector<double> _v;
    size_t _width;
    size_t _height;
public:
    Array2D(size_t height, size_t width, double initVal = 0.0)
        : _v(width * height, initVal),
        _width(width),
        _height(height)
    {}

    double* operator[](size_t y) {
        return _v.data() + y * _width;
    }
};

int main(int, char**) {
    size_t rows = 5;
    size_t cols = 3;
    Array2D a(rows, cols, 0.2);

    for (size_t i = 0; i < cols; ++i)
        a[4][i] = -0.1 * i;

    std::cout << a[4][2] << std::endl; //-0.2

    return 0;
}
Gerard097
  • 815
  • 4
  • 12
  • This seems like the best answer so far. Contiguous memory, and the class wrapping still provides for `obj[i][j]` notation. – sweenish Sep 26 '19 at 19:01