3

I have a vector of size n; n is power of 2. I need to treat this vector as a matrix n = R*C. Then I need to transpose the matrix.

For example, I have vector: [1,2,3,4,5,6,7,8]

I need to find R and C. In this case it would be: 4,2. And treat vector as matrix:

[1,2]
[3,4]
[5,6]
[7,8]

Transpose it to:

[1, 3, 5, 7]
[2, 4, 6, 8]

After transposition vector should be: [1, 3, 5, 7, 2, 4, 6, 8]

Is there existing algorithms to perform in-place non-square matrix transposition? I don't want to reinvent a wheel.

My vector is very big so I don't want to create intermediate matrix. I need an in-place algorithm. Performance is very important.

  • All modofications should be done in oroginal vector. Ideally algorithm should work with chunks that will fit in CPU cache.
  • I can't use iterator because of memory locality. So I need real transposition.
  • It does not matter if matrix would be 2x4 or 4x2
user1209304
  • 408
  • 1
  • 5
  • 26

3 Answers3

0

Since you haven't provided us with any of your code, can I suggest a different approach (that I don't know will work for your particular situation)?

I would use an algorithm based on your matrix to transpose your values into the new matrix yourself. Since performance is an issue this will help even more so since you don't have to create another matrix. If this is applicable for you.

Have a vector

[1, 2, 3, 4, 5, 6, 7, 8]

Create your matrix

[1, 2]
[3, 4]
[5, 6]
[7, 8]

Reorder vector without another matrix

[1, 3, 5, 7, 2, 4, 6, 8]

Overwrite the values in the current matrix (so you don't have to create a new one) and reorder the values based on your current matrix.

Add values in order

R1 and C1 to transposed_vector[0]
R2 and C1 to transposed_vector[1]
R3 and C1 to transposed_vector[2]
R4 and C1 to transposed_vector[3]
R1 and C2 to transposed_vector[4]

And so on.

0

The problem can be divided in two parts. First, find R and C and then, reshape the matrix. Here is something I would try to do:

Since n is a power of 2, i.e. n = 2^k then if k is even, we have: R=C=sqrt(n). And if k is odd, then R = 2^((k+1)/2) and C=2^((k-1)/2).

Note: Since you mentioned you want to avoid using extra memory, I have made some editions to my original answer.

The code to calculate R and C would be something like:

void getRandC(const size_t& n, size_t& R, size_t& C)
{
    int k = (int)log2(double(n)),
        i, j;

    if (k & 1)  // k is odd
        i = (j = (k + 1) / 2) - 1;
    else
        i = j = k / 2;

    R = (size_t)exp2(i);
    C = (size_t)exp2(j);
}

Which needs C++11. For the second part, in case you want to keep the original vector:

void transposeVector(const std::vector<int>& vec, std::vector<int>& mat)
{
    size_t R, C;
    getRandC(vec.size(), R, C);

    // first, reserve the memory
    mat.resize(vec.size());

    // now, do the transposition directly
    for (size_t i = 0; i < R; i++)
    {
        for (size_t j = 0; j < C; j++)
        {
            mat[i * C + j] = vec[i + R * j];
        }
    }
}

And, if you want to modify the original vector and avoid using extra memory, you can write:

void transposeInPlace(std::vector<int>& vec)
{
    size_t R, C;
    getRandC(vec.size(), R, C);

    for (size_t j = 0; R > 1; j += C, R--)
    {
        for (size_t i = j + R, k = j + 1; i < vec.size(); i += R)
        {
            vec.insert(vec.begin() + k++, vec[i]);
            vec.erase(vec.begin() + i + 1);
        }
    }
}

See the live example

polfosol ఠ_ఠ
  • 1,840
  • 26
  • 41
  • I like idea how to calculate R and C. But for transposition it require additional memory. I need to do modifications in oroginal vector without creating intermediate result. – user1209304 Apr 20 '16 at 19:23
  • That was a quite good challenge. This code is now really fast and small and seems to meet your goals @user1209304 – polfosol ఠ_ఠ Apr 20 '16 at 21:40
  • Thanks for solution. Functionally it works correct but on big size vectors like 4194304 it works terribly slow. I stopped application after 1 min. – user1209304 Apr 21 '16 at 08:32
  • That is because of the fact that inserting an element into a big vector and erasing another are time-consuming operations, and most of the times, we need to set a trade-off between memory usage and calculation speed. By the way, it seems that swapping the iterators would need less time compared to removing and attaching the elements. But if you want to use the `iter_swap` function from `` header, the procedure would be much more complicated and sophisticated. Although I didn't give it much thought – polfosol ఠ_ఠ Apr 21 '16 at 12:29
0

For non square matrix representation, I think it may be tricky, and not worth the effort to make the transpose of your flat vector without creating another one. Here is a snippet of what I came up with:

chrono::steady_clock::time_point start = chrono::steady_clock::now();
int i, j, p, k;
vector<int> t_matrix(matrix.size());
for(k=0; k< R*C ;++k)
{
    i = k/C;
    j = k - i*C;
    p = j*R + i;
    t_matrix[p] = matrix[k];
}
cout << chrono::duration_cast<chrono::milliseconds> chrono::steady_clock::now() - start).count() << endl;

Here, matrix is your flat vector, t_matrix is the "transposed" flat vector, and R and C are, respectively rows and vector you found for your matrix representation.

Carafini
  • 111
  • 4
  • I understand that inplcae transposition could be slower. I need to do inplace transposition because my vector is huge and I can't afford to hold 2 vectors in memory. – user1209304 Apr 21 '16 at 08:51