1

I have a simple 2D (row, column) matrix which I currently reorder according to the algorithm below, using another array as final container to swap items.

The problem is that I need to save memory (the code is running on a very low end device), and thus I need to figure a way to reorder the array in-place.

The algorithm is as follows:

for (int iRHS = 0; iRHS < NUM_VARS; iRHS++)
    for (int iRow = 0; iRow < _numEquations; iRow++) {
        coef[iRHS][iRow] = _matrixCoef(iRow, iRHS); 
    }

Note: coef is a pointer to double accessed via subscript, _matrixCoef is a matrix helper class and uses a vector of double accessed by operator(row,col). Here I want to eliminate coef, so that all values are reordered in _matrixCoef in-place instead.

Edit: NUM_VARS is a define set to 2.

Is this possible in-place after all?

Edit 2:

Here is the matrix class which is accessed above via operator overload (row, col):

struct Matrix
{
    /// Creates a matrix with zero rows and columns.
    Matrix() = default;
    /// Creates a matrix with \a rows rows and \a col columns
    /// Its elements are initialized to 0.
    Matrix(int rows, int cols) : n_rows(rows), n_cols(cols), v(rows * cols, 0.) {}
    /// Returns the number or rows of the matrix
    inline int getNumRows() const { return n_rows; }
    /// Returns the number or columns of the matrix.
    inline int getNumCols() const { return n_cols; }
    /// Returns the reference to the element at the position \a row, \a col.
    inline double & operator()(int row, int col) { return v[row + col * n_rows]; }
    /// Returns the element at the position \a row, \a col by value.
    inline double operator()(int row, int col) const { return  v[row + col * n_rows]; }
    /// Returns the values of the matrix in column major order.
    double const * data() const { return v.data(); }
    /// Returns the values of the matrix in column major order.
    double * data() { return v.data(); }
    /// Initialize the matrix with given size. All values are set to zero.
    void initialize(int iRows, int iCols)
    {
        n_rows = iRows;
        n_cols = iCols;
        v.clear();
        v.resize(iRows * iCols);
    }
    
    void resize(int iRows, int iCols)
    {
        n_rows = iRows;
        n_cols = iCols;
        v.resize(iRows * iCols);
    }
private:
    int n_rows = 0;
    int n_cols = 0;
    std::vector<double> v;
};
benjist
  • 2,740
  • 3
  • 31
  • 58
  • 2
    This code looks like matrix transposition, is your reordering algorithm different ? If yes, in what way ? Can you give us insights of what the _MatrixCoef helper does internally ? – trialNerror Nov 21 '21 at 23:52
  • See [this question](https://stackoverflow.com/questions/25932812/how-to-efficiently-transpose-non-square-matrices). If the matrix is square, this is easily doable by swapping elements, but if the matrix isn't square you either need a new one, need to change how you access the existing memory, or need more complicated code to move elements as it isn't a simple swap of two values. – 1201ProgramAlarm Nov 22 '21 at 00:30
  • @trialNerror _matrixCoef is one of the two resulting matrices from a linear system solved before. E.g., no easy part to change how to order items at this place. – benjist Nov 22 '21 at 00:32
  • With NUM_VARS == 2, your task is to perform the inverse of a "perfect shuffle". There's a lot you can google about perfect shuffles that also applies to the inverse operation. Doing it in-place in linear time is tricky. – Matt Timmermans Nov 22 '21 at 16:41

3 Answers3

1

After you posted code, I will suggest another solution, that's rather simple and quick to implement.

In your current Matrix class:

struct Matrix
{
    // ...

    // add this:
       void transpose()
       {
           is_transposed = !is_transposed;
       }
    // ...

    // modify these:

    /// Returns the number or rows of the matrix
    inline int getNumRows() const { return (is_transposed) ? n_cols : n_rows; }
    /// Returns the number or columns of the matrix.
    inline int getNumCols() const { return (is_transposed) ? n_rows : n_cols; }
    /// Returns the reference to the element at the position \a row, \a col.
    inline double & operator()(int row, int col) 
    {
        if (is_transposed) 
            return v[col + row * n_rows]; 
        return v[row + col * n_rows]; 
    }
    /// Returns the element at the position \a row, \a col by value.
    inline double operator()(int row, int col) const 
    { 
        if (is_transposed) 
            return  v[col + row * n_rows]; 
        return  v[row + col * n_rows]; 
    }

private:
    // ...

    // add this:
    bool is_transposed = false;
};

You may want to modify other member functions, depending on your application.

Michaël Roy
  • 6,338
  • 1
  • 15
  • 19
0

Well, assuming your matrix is a square matrix, that is NUM_VARS == _numEquations, it is possible. Otherwise the size of the resulting matrix will not allow in-place transposition. In that case, a solution would be to modify the downstream computations to swap the row/column indices while doing the operations.

But if it's square, you can try this:

for (size_t r = 0; r < NUM_VARS; ++r)
    for (size_t c = 0; c < NUM_VARS; ++c)
    {
         if (&_matrixCoef[r][c] < &_matrixCoef[c][r])
             std::swap(_matrixCoef[r][c], _matrixCoef[c][r]);
    }
Michaël Roy
  • 6,338
  • 1
  • 15
  • 19
  • 1
    Maybe change the inner loop to `for (size_t c = r + 1; c < NUM_VARS; ++c)` – Charlie S Nov 22 '21 at 00:25
  • Unfortunately, NUM_VARS is a define and fixed at 2. Sorry for not mentionin that. – benjist Nov 22 '21 at 00:27
  • @CharlieS I'm pretty sure there is a very clever ways to reduce the number of iterations. I opted for the safest route. Any other options would need some thorough testing, I'd rather leave that optimization task to the OP. – Michaël Roy Nov 22 '21 at 00:30
  • 1
    @benjist Then you still have the option to change the way the downstream operations are accessing the matrix, by interverting row and column index in these functions. – Michaël Roy Nov 22 '21 at 00:32
  • This isn't safe. Comparing pointers is only valid if the pointers point to the same array, or to non-static members of an object. For example, if `_matrixCoef` is implemented with a `vector>` then the comparison is invalid (although in practice it is likely to work). The comparison could just as easily be `r < c` (or `r > c`), you can start the inner loop at `r+1` and dispense with the comparison completely and reduce the time it takes to run. – 1201ProgramAlarm Nov 22 '21 at 00:47
  • @1201ProgramAlarm Comparing addresses of variables that belong to a single co,tainer is as safe as it gets. Otherwise the entire implementation of std::vector<> would be UB. – Michaël Roy Nov 22 '21 at 00:55
  • @1201ProgramAlarm I'll take that back... Comparing addresses is as safe as it gets, Unconditionally. Especially for this application. – Michaël Roy Nov 22 '21 at 01:05
0

If you want to explicitly re-order the data array in-place:

Consider this post that details in-place re-ordering of an array given a permutation. I have modified it only slightly. Algorithm to apply permutation in constant memory space

The expression for the permutation can be deduced as follows:

For index k = i + j * rows, j = k / rows and i = k - j * rows (integer division). The index of the transposed matrix is k_transpose = j + i * cols. Now for the code:

int perm(int k, int rows, int cols)
{
    int j = k / rows;
    int i = k - j * rows;
    return j + i * cols;
}
template<typename Scalar>
void transpose_colmajor(Scalar* A, int rows, int cols)
{
//tiny optimization: for (int i = 1; i < rows * cols - 1; i++)
    for (int i = 0; i < rows * cols; i++) {
        int ind = perm(i, rows, cols);
        while (ind > i)
            ind = perm(ind, rows, cols);
        std::swap(A[i], A[ind]);
    }
}

Example with a symbolic matrix:

    int rows = 4;
    int cols = 2;
    char entry = 'a';
    std::vector<char> symbolic_matrix;
    for (int col = 0; col < cols; col++)
    {
        for (int row = 0; row < rows; row++)
        {
            symbolic_matrix.push_back(entry);
            entry++;
        }
    }
    for (char coefficient : symbolic_matrix)
        std::cout << coefficient << ",";
    std::cout << "\n\n";

    transpose_colmajor(symbolic_matrix.data(), rows, cols);

    for (char coefficient : symbolic_matrix)
        std::cout << coefficient << ",";
    std::cout << "\n\n";

Output:

a,b,c,d,e,f,g,h,

a,e,b,f,c,g,d,h,

Of course, you will also have to update your struct with the correct rows and columns. The asymptotic computational complexity is not guaranteed to be linear, but there are no auxiliary data structures that consume your precious memory!

EDIT: I have to admit, I learned a lot from this question. I was wondering why transposing a square matrix is so intuitive while the rectangular case is not. It has to do with the cycles of the permutation. This algorithm https://www.geeksforgeeks.org/minimum-number-swaps-required-sort-array/ calculates the minimum swaps required for a given permutation. If you plug in the permutation expression above, you can see that the minimum number of swaps increases dramatically when rows != cols. For example, a 4 x 4 matrix requires 6 swaps, whereas a 4 x 3 matrix requires 8, which to me is very counter-intuitive! The algorithm I posted requires rows * cols - 2 swaps, which is not optimal. However, based on random experimentation of rectangular matrices, it appears that the gap between the minimum swaps is pretty small and is less important for larger matrices. It is definitely worth specializing the transposition routine for square matrices, as the minimum swaps is always n * (n - 1)/2.

Charlie S
  • 305
  • 1
  • 2
  • 7