Hello, I have a problem with allocating memory for a 2D matrix. I allocate it like this:

float *tempMatrix=(float *)malloc(size * size * sizeof (float));

Using this code I create two matrices, which I later try to multiply with this code:

for (xDim=0;xDim<matrixSize;xDim++){
    for(yDim=0;yDim<matrixSize;yDim++){
        for (i=0;i<matrixSize;i++){
            result+=matrixA[xDim*matrixSize +i]*matrixB[(i)*matrixSize+yDim];
        }
        resultMatrix[xDim*matrixSize+yDim]=result;
        result=0;
    }
}

I want to run this on many threads using OpenMP, but when I set the matrix size to be greater than 1000 the program hangs and does not execute at all. My guess is that there is a problem with the memory allocation, but I am not 100% sure. Can anyone clarify what is wrong here or how to improve the code?
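One way to rule the allocation in or out, as a minimal sketch assuming plain C: wrap the `malloc` call in a helper that reports failure. The name `alloc_matrix` is hypothetical and does not appear in the original program. For scale, a 1000 x 1000 matrix of `float` needs 1000 * 1000 * 4 = 4,000,000 bytes, so three such matrices need about 12 MB.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not part of the original program): allocates one
   contiguous n x n block of floats and reports failure explicitly. */
float *alloc_matrix(size_t n)
{
    /* Reject n == 0 and guard against overflow in n * n * sizeof(float). */
    if (n == 0 || n > SIZE_MAX / n / sizeof(float)) {
        fprintf(stderr, "matrix size %zu is not usable\n", n);
        return NULL;
    }

    float *m = malloc(n * n * sizeof *m);
    if (m == NULL) {
        perror("malloc");  /* the caller still has to check for NULL */
    }
    return m;
}
```

If every call returns a non-NULL pointer, the allocation itself is probably not what makes the program appear to hang.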

Thanks in advance

Edit#1

Here is my code with OpenMP directives:

#pragma omp parallel shared(matrixA,matrixB,resultMatrix,matrixSize) private(i,result,xDim,yDim) \
        num_threads(threadNo)
{
    #pragma omp single
    {
        printf("Number of threads %d.\n",omp_get_num_threads());
    }
    #pragma omp for schedule(guided,1)
    for (xDim=0;xDim<matrixSize;xDim++){
        for(yDim=0;yDim<matrixSize;yDim++){
            for (i=0;i<matrixSize;i++){
                result+=matrixA[xDim][i]*matrixB[i][yDim];
            }
            resultMatrix[xDim][yDim]=result;
            result=0;
        }
    }
}
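A possible rewrite of the loop above, offered as a sketch and not as a confirmed fix for the hang. It assumes each matrix is one contiguous row-major block, as in the first snippet, and the function name `matmul_omp` and its parameters are hypothetical. One detail worth noting: a variable listed in `private(...)` starts uninitialised in each thread, so `result` in the code above is read by `+=` before it is ever set. Declaring the accumulator inside the loop avoids that. The sketch also uses `schedule(static)` where the original uses `guided,1`; that is a substitution, not something the original specifies.

```c
#include <stddef.h>

/* Sketch: c = a * b for n x n row-major matrices stored as flat arrays.
   Hypothetical function, not taken from the original program. */
void matmul_omp(const float *a, const float *b, float *c, int n)
{
    /* Variables declared inside the loops are private to each thread,
       so no private(...) clause is needed. */
    #pragma omp parallel for schedule(static)
    for (int x = 0; x < n; x++) {
        for (int y = 0; y < n; y++) {
            float sum = 0.0f;  /* initialised for every output element */
            for (int i = 0; i < n; i++) {
                sum += a[(size_t)x * n + i] * b[(size_t)i * n + y];
            }
            c[(size_t)x * n + y] = sum;
        }
    }
}
```

Without OpenMP enabled at compile time (for example `-fopenmp` with GCC) the pragma is ignored and the loop simply runs serially, which makes it easy to compare results with and without threads.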

Edit#2: I changed the initialization of the matrices to use a single block, as suggested.

Rodri
  • Use a debugger to see where exactly it is hanging. Add checks that `malloc()` is not returning NULL. Also, are you using C, or C++? – John Zwinck Apr 26 '16 at 06:13
  • Where does it hang exactly? What's the last thing it does? How many matrices do you try to create? How much memory do you have? – David Schwartz Apr 26 '16 at 06:13
  • A failure due to allocation would cause a crash, not a hang. Is your program already parallelized? –  Apr 26 '16 at 06:15
  • @jdarthenay There's some evidence this is C++ (the casting of the result of `malloc`), so why would you assume it's C? – David Schwartz Apr 26 '16 at 06:15
  • Since the program hangs, I doubt it to be a memory problem related to malloc. That would be more likely to result in a crash. – Support Ukraine Apr 26 '16 at 06:15
  • I removed the C++ tag because you are using `malloc()`, and tagged it malloc. If you are doing C++ use `new`, not `malloc()`. – jdarthenay Apr 26 '16 at 06:16
  • You can easily check if any allocation failed, because `malloc` will return `NULL`. Any hang might be related to the subsequent **O(N^3)** multiplication algorithm. You should show a little more code from your program so we can comment on whether you're losing values or not initialising them. – paddy Apr 26 '16 at 06:16
  • Every time I see a matrix modelled with double pointers I die a little inside. Use a single contiguous block for all but the largest structures. – Bathsheba Apr 26 '16 at 06:18
  • @DavidSchwartz Well, it may be considered bad practice to cast `malloc()` in C, but it's also bad practice to use `malloc()` in C++. – jdarthenay Apr 26 '16 at 06:18
  • Does the hang occur before you try to use OpenMP or only after? – Michael Burr Apr 26 '16 at 06:18
  • You said you use OpenMP for parallelisation. However, you don't show your directives, and there are plenty of opportunities to get them wrong (forgetting to declare `yDim` and `i` `private`, or `result` as `private` or `reduction(+:)` depending on the circumstances, etc). Please add your actual directives as they might very well be at the origin of your issue. – Gilles Apr 26 '16 at 06:33
  • We won't be able to reproduce the problem with the code posted. You should however strongly consider using a 2D array instead, since this code will be awfully slow, particularly if called from multiple threads. – Lundin Apr 26 '16 at 06:37
  • Why don't you use a single block per matrix instead of a separate block for each row? – M.M Apr 26 '16 at 07:31
  • @Lundin So what should such code look like, or how can I speed up my code so that it can operate on 2D arrays of size [10k][10k]? – Rodri Apr 26 '16 at 08:34
  • @Rodri http://stackoverflow.com/questions/12462615/how-do-i-correctly-set-up-access-and-free-a-multidimensional-array-in-c – Lundin Apr 26 '16 at 08:39
  • How long did you wait? You seem to be doing a matrix multiplication; for size = 1000 the inner loop executes a billion times, and your code isn't exactly cache friendly (you are doing two pointer accesses and reading two floats on each iteration), so this may take tens of seconds. For size = 2000 it will take eight times longer. – gnasher729 Apr 26 '16 at 06:18
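The cost estimate in the comments can be checked by counting multiply-adds: the triple loop performs size^3 of them, so 1000^3 = 10^9 for size = 1000, 2000^3 = 8 x 10^9 for size = 2000 (eight times more), and 10^12 for the [10k][10k] case asked about (a thousand times more than size = 1000). On the cache-friendliness point, one commonly used variation is to swap the two inner loops so that both `b` and `c` are walked along rows. The sketch below assumes flat row-major matrices; the name `matmul_ikj` is hypothetical, and whether it is actually faster depends on the compiler and hardware.

```c
#include <stddef.h>
#include <string.h>

/* Sketch: c = a * b for n x n row-major matrices, with the two inner
   loops swapped so that b and c are both accessed along rows. */
void matmul_ikj(const float *a, const float *b, float *c, size_t n)
{
    /* c is accumulated into, so clear it first
       (all-zero bytes represent 0.0f on IEEE 754 platforms). */
    memset(c, 0, n * n * sizeof *c);

    for (size_t x = 0; x < n; x++) {
        for (size_t i = 0; i < n; i++) {
            float ax = a[x * n + i];  /* reused for a whole row of b */
            for (size_t y = 0; y < n; y++) {
                c[x * n + y] += ax * b[i * n + y];
            }
        }
    }
}
```

The arithmetic is the same as in the original loop order; only the memory access pattern changes, so the operation count above still applies.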

0 Answers