I am having a problem with my simple kernel for complex matrix multiplication. Here it is:

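// Complex and multiply() are defined elsewhere in the program source (not shown).
// All matrices are indexed in column-major order.
__kernel void myGEMMcomplex(__global double2* A, __global double2* B, __global double2* C, int rowsB, int colsB, int rowsA)
{
    int globalRow = get_global_id(0); // row of C computed by this work-item
    int globalCol = get_global_id(1); // column of C computed by this work-item

    Complex acc;
    acc.x = 0;
    acc.y = 0;

    // Dot product of row globalRow of A with column globalCol of B
    for (int k = 0; k < rowsB; k++) {
        acc += multiply(A[k*rowsA + globalRow], B[globalCol*rowsB + k]);
    }

    C[globalCol*rowsA + globalRow] = acc;
}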
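For readers who want to check results on the host, here is a minimal plain-C sketch of what a single work-item computes. It is not the original code: the `Complex` struct, the body of `multiply`, and the `gemm_element` helper are assumptions made for illustration (the kernel's own `Complex` and `multiply` are not shown in the question). It assumes `x` holds the real part, `y` the imaginary part, and that `multiply` is the ordinary complex product.

```c
#include <stddef.h>

/* Assumed stand-in for the kernel's Complex type: x = real, y = imaginary. */
typedef struct { double x, y; } Complex;

/* Assumed definition of multiply(): the ordinary complex product. */
static Complex multiply(Complex a, Complex b) {
    Complex r;
    r.x = a.x * b.x - a.y * b.y;
    r.y = a.x * b.y + a.y * b.x;
    return r;
}

/* Hypothetical host-side reference for one element C(row, col).
   Uses the same column-major indexing as the kernel. */
static Complex gemm_element(const Complex *A, const Complex *B,
                            int rowsA, int rowsB, int row, int col) {
    Complex acc = {0.0, 0.0};
    for (int k = 0; k < rowsB; k++) {
        Complex p = multiply(A[(size_t)k * rowsA + row],
                             B[(size_t)col * rowsB + k]);
        acc.x += p.x;
        acc.y += p.y;
    }
    return acc;
}
```

Comparing a few elements of the GPU result against `gemm_element` on small inputs can help separate indexing mistakes from driver or resource problems.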

I can multiply quite large matrices with this kernel (such as 7000x7000) on my Intel HD 4000. On my NVIDIA GeForce 720M, however, it can only multiply small matrices (1000x1000). With larger matrices it throws an exception:

`Exception in thread "main" com.nativelibs4java.opencl.CLException$OutOfResources: OutOfResources (make sure to log all errors with environment variable CL_LOG_ERRORS=stdout)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at java.lang.Class.newInstance(Class.java:442)
at com.nativelibs4java.opencl.CLException.error(CLException.java:308)
at com.nativelibs4java.opencl.CLBuffer.read(CLBuffer.java:453)
at com.nativelibs4java.opencl.CLBuffer.read(CLBuffer.java:383)
at com.nativelibs4java.opencl.CLBuffer.read(CLBuffer.java:211)
at com.mycompany.mavenproject1.JavaCLTutorial1.main(JavaCLTutorial1.java:91)`

I also get a warning on my screen: "Display drivers stopped responding and has recovered". I use the integrated GPU for my display while running the calculations on the GeForce, but I haven't noticed any significant benefit from this. What is more, I have also written a matrix inversion kernel: with the Intel HD I can invert matrices of 2000x2000 and larger, but the GeForce can only invert 200x200 matrices.

Could someone explain what the problem is, why my GeForce GPU is less stable than the integrated Intel HD, and whether there is any way to solve this?

talonmies
  • The NVIDIA WDDM driver imposes a TDR timer limit on how long your GPU can be occupied by a compute job before it must either yield or be killed. That is the source of the driver reset you are seeing. – talonmies Jan 02 '16 at 21:25
  • So does this mean that the Intel HD 4000 does not have this limit, and is it safe to adjust the TDR setting? – user3173452 Jan 02 '16 at 21:44
  • No, Intel also has this limit (it is imposed by the Windows WDDM system -- every display driver has this TDR limit). But the value could be different. See [Intel's release notes](https://software.intel.com/en-us/articles/intel-sdk-for-opencl-applications-2013-release-notes). But are you actually sure that when you run on the Intel device it is the HD4000 GPU and not the CPU itself? Intel ships both ICDs in their OpenCL SDK. The CPU OpenCL device obviously has no TDR limitations. – talonmies Jan 02 '16 at 21:54
  • In my experience, the HD4000 is at least 10 times faster than the CPU for matrices of 1000x1000 and larger. However, even after increasing the TDR value I still get the com.nativelibs4java.opencl.CLException$OutOfResources: OutOfResources exception. On the other hand, I no longer get the "Display drivers stopped responding and has recovered" warning. I will try to search for information about this exception. – user3173452 Jan 02 '16 at 22:05
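The TDR limit discussed in the comments applies to how long a single piece of GPU work keeps the device busy. One workaround that may help, besides raising the TDR value, is to split the multiplication into several shorter kernel launches, each covering only a slice of the columns of C. The plain-C sketch below only computes the slices. `Chunk` and `split_columns` are hypothetical names introduced here, not part of OpenCL or JavaCL, and the right slice size for a given GPU would have to be found by experiment.

```c
/* Hypothetical helper type: one slice of the column range of C. */
typedef struct { int offset; int count; } Chunk;

/* Split totalCols columns into slices of at most maxColsPerLaunch columns.
   Writes at most maxChunks entries to out and returns the number of
   slices required (which may exceed maxChunks). */
static int split_columns(int totalCols, int maxColsPerLaunch,
                         Chunk *out, int maxChunks) {
    if (totalCols <= 0 || maxColsPerLaunch <= 0) {
        return 0;
    }
    int n = 0;
    for (int offset = 0; offset < totalCols; offset += maxColsPerLaunch) {
        int remaining = totalCols - offset;
        if (n < maxChunks) {
            out[n].offset = offset;
            out[n].count = remaining < maxColsPerLaunch ? remaining
                                                        : maxColsPerLaunch;
        }
        n++;
    }
    return n;
}
```

Each slice would then be launched separately, for example by passing `offset` as the second component of the global work offset and `count` as the second component of the global work size (OpenCL 1.1 and later include the offset in `get_global_id`), and waiting for one launch to finish before enqueueing the next. Whether this is enough to avoid the driver reset on the GeForce is untested here.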

0 Answers