Divide and Conquer In-place Transpose of a Matrix

Question

I'm working on an implementation of the approach described in the wiki article for the in-place cache-oblivious transposition of a square Matrix.

https://en.wikipedia.org/wiki/In-place_matrix_transposition

The algorithm basically recursively splits the Matrix into four, then transposes the quadrants which are along the diagonal and swaps the ones that are above and below it. The actual transposition/swapping only occurs if the matrix is of size 2*2 or lower, otherwise, it is split again.

I've split it into three functions:

This initiates the procedure for a given size N:

void SmartTranspose(int A[row][col]) {
    Transpose(A, 0, 0, N, N);
}

Then:

void Transpose(int A[row][col], int x, int y, int w, int h) {
    int Temp;
    if ((w - x) * (h - y) <= 4){
        for (int row1 = x  ; row1 < w -1 ; row1++)
            for (int col1 = y + 1 ; col1 < h ; col1++) {
                Temp = A[row1][col1];
                printf("transp: %d %d\n", A[row1][col1], A[col1] [row1]);
                A[row1][col1]  =  A[col1][row1];
                A[col1][row1]  = Temp;
            }
    }
    else {
        int halfh = h / 2;
        int halfw = w / 2;
        Transpose(A, x, y, halfw , halfh);
        Transpose(A, x + halfw, y + halfh, w , h);
        TransposeSwap(A, x + halfw, y, w, halfh, x, y + halfh, halfw , h);
    }
}

And finally:

void TransposeSwap(int A[row][col], int x, int y, int w, int h,int x1, int y1, int w1, int h1) {
    int Temp; int row2 = x1; int col2 = y1;
    if ((w - x)  * (h - y) <= 4 && (w1 - x1)  * (h1 - y1) <= 4) {
        for(row1 = x; row1 < w; row1++)
            for(col1 = y; col1 < h; col1++)
            {
                Temp = A[row1][col1] ;
                A[row1][col1] = A[col1][row1];
                A[col1][row1] = Temp;
            }
    }
    else {
        printf("RECURSE");
        int halfh = h / 2;
        int halfw = w / 2;
        int halfh1 = h1 / 2;
        int halfw1 = w1 / 2;
        TransposeSwap(A, x, y, halfw, halfh, x1, y1, halfw1, halfh1);
        TransposeSwap(A, x + halfw, y, w, h - halfh, x1, y1 + halfh1, halfw1, h1);
        TransposeSwap(A, x , y + halfh, halfw, h, x1 + halfw1, y1, w1, halfh1);
        TransposeSwap(A, x + halfw, y + halfh, w, h, x1 + halfw1, y1 + halfh1, w1, h1);
    }
}

This does not work, however, and I'm struggling to see where my logic goes wrong here.

Edit: Example of Output

 Original matrix: 
 1948037971 40713922 986050715 74181839 943010147 1060710730 
 18590233 268906808 1966315840 1325423973 398061279 2047858287 
 513589654 1727398080 2016821685 277200601 1611383116 2000671901 
 228038281 1863845528 106517081 1934721636 745170263 1736525254 
 224427632 687572994 1249224754 1497415191 537022734 1443375385 
 1054092341 337577057 1484089307 2040143056 411758897 279615807  

Transposed matrix: 
1948037971 18590233 513589654 74181839 943010147 1060710730 
40713922 268906808 1727398080 1325423973 398061279 2047858287 
986050715 1966315840 2016821685 277200601 1611383116 2000671901 
228038281 1863845528 106517081 1934721636 745170263 1736525254 
224427632 687572994 1249224754 1497415191 537022734 1443375385 
1054092341 337577057 1484089307 2040143056 411758897 279615807

The correct output should be a transposed matrix.

Edit: Main function and declarations:

int row = 40000 , col = 40000;
static int A[40000][40000];
static int N[100] = {0};
void SmartTranspose(int A[row][col]);
void Transpose(int A[row][col], int x, int y, int w, int h);
void InitializeMatrix(int X[row][col]);
void PrintMatrix(int X[row][col]);

double pow(double x, double y);
int matrix = 0;
void TransposeSwap(int A[row][col], int x, int y, int w, int h,int x1, int y1, int w1, int h1);

 int main(){   

 srand(time(NULL));

 double sizes = 0;
 int count = 0;


for(sizes = 20; sizes < 30; sizes++)
{  
  N[count] = floor(pow(2, (sizes/9)));
 printf("N     %d\n", N[count]);
  count++; }


for (matrix = 0; matrix <= count -1 ; matrix++){


InitializeMatrix(A);    
printf("N %d\n",N[matrix]);
printf("\nOriginal matrix: \n");




SmartTranspose(A);

printf("E\n");

printf("\nTransposed matrix: \n");
PrintMatrix(A); 

 }     
return 0;
 }

A link to the complete code: https://jpst.it/QaBq

Could you provide an example of the output you get vs what you should get ? — frostblue, Dec 08 '16 at 14:06
Don't you need to take care in e.g. `halfh = h / 2;` that `h` is even? — Paul Ogilvie, Dec 08 '16 at 14:14
I'm assuming that halfh will be floored, so if h = 5 then halfh = 2, then the other 'half' will be from 2 to 5. Is that problematic? — MHH, Dec 08 '16 at 14:20
Is `int A[row][col]` in the function parameter declaration list the same as `int A[N][N]`? — Ian Abbott, Dec 08 '16 at 14:51
Not exactly, no.I declared a global static int A[40000][40000]; and I'm calling the different matrix sizes every time. — MHH, Dec 08 '16 at 15:05
@MHH. But you do not appear to pass the overall dimensions of the array to any of these functions at all. If you want it to cope with variable length arrays, you need to declare the dimensions. Let's assume square matrices, since a different algorithm will be required for non-square matrices. The dimension(s) should be defined before the array parameter like this: `void SmartTranspose(int dim, int A[dim][dim])` and similarly for the other functions. — Ian Abbott, Dec 08 '16 at 15:15
Ah okay so A[row][col] is actually the dimension of the array, row and col are also declared as globals, which I think was an obvious problem because I was using them to iterate in the function. Fixed that. The results still look pretty strange though. I'm going to add the rest of the code to the post. — MHH, Dec 08 '16 at 15:27
How does the `N` used by `SmartTranspose` relate to the global array `int N[100]`? It might be better if you post a complete .c file. — Ian Abbott, Dec 08 '16 at 15:49
N[matrix] is the size of each matrix. N[100] is the maximum number of matrices over which I iterate to transpose each one. Added it. — MHH, Dec 08 '16 at 15:59
Okay, so the output was somewhat messed up because the print function was also using the the same global variables col and row. Fixed it and added a new better, but still incorrect output. — MHH, Dec 08 '16 at 16:13
So, the problem now actually is with off matrices only, yes. — MHH, Dec 08 '16 at 16:59

score 2 · Accepted Answer · answered Dec 08 '16 at 17:21

Here is my attempt to demonstrate a working algorithm. Since the matrix is square, I have eliminated a few function parameters. Also, I have left in some debugging code to show the progress of the algorithm at each recursion level.

It can transpose any sub-matrix whose diagonal lies on the main diagonal. I tested with a 9x9 matrix in the 0,0 corner of a 100x100 array.

#include <stdio.h>

int dbglvl;

void TransposeSwap(int dim, int A[dim][dim], int rs, int cs, int re, int ce) {
    int Temp;

    for (Temp = 0; Temp < dbglvl; Temp++) {
        putchar('>');
    }
    printf("TransposeSwap(dim=%d, rs=%d, cs=%d, re=%d, ce=%d)\n", dim, rs, cs, re, ce);
    if (re - rs <= 2 && ce - cs <= 2) {
        for (int r = rs; r < re; r++)
            for (int c = cs; c < ce; c++)
            {
                printf("transp %d %d: %d %d\n", r, c, A[r][c], A[c][r]);
                Temp = A[r][c] ;
                A[r][c] = A[c][r];
                A[c][r] = Temp;
            }
    }
    else {
        int rm = (rs + re) / 2;
        int cm = (cs + ce) / 2;
        dbglvl++;
        TransposeSwap(dim, A, rs, cs, rm, cm);
        TransposeSwap(dim, A, rm, cs, re, cm);
        TransposeSwap(dim, A, rs, cm, rm, ce);
        TransposeSwap(dim, A, rm, cm, re, ce);
        dbglvl--;
    }
    for (Temp = 0; Temp < dbglvl; Temp++) {
        putchar('<');
    }
    printf("TransposeSwap\n");
}

void Transpose(int dim, int A[dim][dim], int s, int e) {
    int Temp;

    for (Temp = 0; Temp < dbglvl; Temp++) {
        putchar('>');
    }
    printf("Transpose(dim=%d, s=%d, e=%d)\n", dim, s, e);
    if (e - s <= 2) {
        for (int r = s; r < e - 1 ; r++)
            for (int c = s + 1 ; c < e ; c++) {
                printf("transp %d %d: %d %d\n", r, c, A[r][c], A[c][r]);
                Temp = A[r][c];
                A[r][c]  =  A[c][r];
                A[c][r]  = Temp;
            }
    }
    else {
        int m = (s + e) / 2;
        dbglvl++;
        Transpose(dim, A, s, m);
        Transpose(dim, A, m, e);
        TransposeSwap(dim, A, m, s, e, m);
        dbglvl--;
    }
    for (Temp = 0; Temp < dbglvl; Temp++) {
        putchar('<');
    }
    printf("Transpose\n");
}

void Dump(int dim, int A[dim][dim], int rs, int cs, int re, int ce) {
    int r, c;

    for (r = rs; r < re; r++) {
        for (c = cs; c < ce; c++) {
            printf("%d ", A[r][c]);
        }
        putchar('\n');
    }
}

#define N 100

int test[N][N] = {
    { 11, 12, 13, 14, 15, 16, 17, 18, 19 },
    { 21, 22, 23, 24, 25, 26, 27, 28, 29 },
    { 31, 32, 33, 34, 35, 36, 37, 38, 39 },
    { 41, 42, 43, 44, 45, 46, 47, 48, 49 },
    { 51, 52, 53, 54, 55, 56, 57, 58, 59 },
    { 61, 62, 63, 64, 65, 66, 67, 68, 69 },
    { 71, 72, 73, 74, 75, 76, 77, 78, 79 },
    { 81, 82, 83, 84, 85, 86, 87, 88, 89 },
    { 91, 92, 93, 94, 95, 96, 97, 98, 99 },
};

int main(void) {
    puts("Original:");
    Dump(N, test, 0, 0, 9, 9);
    putchar('\n');
    dbglvl = 1;
    Transpose(N, test, 0, 9);
    putchar('\n');
    puts("Transposed:");
    Dump(N, test, 0, 0, 9, 9);
    return 0;
}

This is very well thought out and organized. Puts my code to shame. I think I can spot where the issue is in mine. — MHH, Dec 08 '16 at 17:43

Divide and Conquer In-place Transpose of a Matrix

1 Answers1