Return a pointer or value when objects are large

Question

Say we make a Matrix<n, m> class, that stores nxm integers in a member variable std::array<std::array<int, m>, n> inner;. Now there are two ways of making add:

Method 1) Returning by value (constexpr is possible)

template<int n, int m> class Matrix {
    ...
    constexpr Matrix add(const Matrix& that) const {
        matrix ret;
        for (int y = 0; y < n; y++)
            for (int x = 0; x < m; x++)
                ret.inner[y][x] = this->inner[y][x] + that.inner[y][x];
        return ret;
    }
    ...
}

Method 2) Returning a pointer (constexpr is not possible)

template<int n, int m> class Matrix {
    ...
    Matrix *add(const Matrix& that) const {
        Matrix *ret = new Matrix();
        for (int y = 0; y < n; y++)
            for (int x = 0; x < m; x++)
                ret->inner[y][x] = this->inner[y][x] + that.inner[y][x];
        return ret;
    }
    ...
}

My program need to do arithmetic with 1000x1000 matrices (images), so using Method 1 I get a stackoverflow immediately. My program also works with small 4x4 matrices (density matrices in quantum computations) that need to be computed at compile time, so using Method 2 is impossible in these constexpr initializations.

Question: do I need to make two versions of every method returning Matrix? (one returning a Matrix and the other returning a Matrix*) This will be a lot of duplicate code. How common is this problem? Is there an alternative way of using the heap so that Method 2 is also possible from constexpr?

This answer mentions that move-semantics have shifted the preference a little bit to Method 1. But how about the resulting stack-overflows?

Don't use `std::array, n>`, use an appropriately sized `std::vector`. — molbdnilo, Apr 27 '20 at 13:10
That is not possible unfortunately. I do calls to an external dll (the Math Kernel Library) that expects the matrices to be stored like `std::array`. — Carucel, Apr 27 '20 at 13:11
the part that actually does cause the overlow is not in the code you posted. If `Matrix` does manage the dynamic memory you do not need to dynamically allocate the `Matrix` — 463035818_is_not_an_ai, Apr 27 '20 at 13:24
If you mean Intel's MKL, it uses one-dimensional arrays in the C interface and does not have a C++ interface. — molbdnilo, Apr 27 '20 at 13:24
Only 1000x1000? If you don't have too many threads, or you are using 64-bit binary, just increase stack size. — Alex Guteniev, Apr 27 '20 at 13:25
@molbdnilo: I use an `extern "C"` call to the dll to use the library @AlexGuteniev: Yes, the problem is that parts are also multithreaded and I am afraid of the extra required memory, especially because I might increase the resolution to a sqaure matrix of dimension `10000` — Carucel, Apr 27 '20 at 13:27
@Carucel `std::array` and `std::vector` both store their elements consecutively. It's the same layout. It shouldn't be a problem using an `extern C` interface. — François Andrieux, Apr 27 '20 at 13:27
@FrançoisAndrieux Thank you, I didn't know that. I prefer to use `std::array`, because of performance, but if there is no other way, I will change to using `std::vector` then. — Carucel, Apr 27 '20 at 13:30
@Carucel It's not clear what performance gains you are expecting. But if you want to move an `std::array` off the stack, you'll end up with an `std::vector`. If you want to avoid dynamic memory allocation or trying to keep locality with stack objects, any solution that moves your data off the stack will break those anyway. — François Andrieux, Apr 27 '20 at 13:33
@FrançoisAndrieux, I was thinking about `std::vector` managing it's size, which can be variable, but maybe there is no performance gain indeed if I know the size in advance. — Carucel, Apr 27 '20 at 13:36
@Carucel If you don't want `std::vector`'s size and capacity management, you could use `std::unique_ptr>` instead. But you'll have to implement copy construction, copy assignment and comparison yourself. — François Andrieux, Apr 27 '20 at 13:41

Aconcagua · Answer 1 · 2020-06-10T10:22:03.777

My proposition:

template<size_t n, size_t m>
class Matrix
{
public:
    Matrix& operator+=(const Matrix& other)
    {
        return *this;
    }
};

template<size_t n, size_t m>
Matrix<n, m> constexpr operator+(Matrix<n, m> x, Matrix<n, m> const& y)
{
    x += y;
    return x;
}

template<size_t n, size_t m>
std::unique_ptr<Matrix<n, m>> add(Matrix<n, m> const* x, Matrix<n, m> const* y)
{
    auto result = std::make_unique<n, m>(*x);
    *result += *y;
    return std::unique_result;
}

Now you can use addition by value for small matrices:

Matrix<10, 12> m1, m2;
auto m3 = m1 + m2;

and large matrices you could allocate dynamically:

auto pm1 = std::make_unique<Matrix<12, 10>>();
auto pm2 = std::make_unique<Matrix<12, 10>>();
auto pm3 = add(pm1.get(), pm2.get());

Or even:

auto pm3 = std::make_unique<Matrix<12, 10>>(pm1);
*pm3 += *pm2;

I think you forgot the actual addition in `operator+=`. Maybe put a placeholder comment at least. — François Andrieux, Apr 27 '20 at 13:52
@FrançoisAndrieux It's inside the class... Admitted, I don't add anything inside, but for demonstration that should suffice. — Aconcagua, Apr 27 '20 at 18:21

Return a pointer or value when objects are large

1 Answers1