I said in this question that I had a problem loading PTX modules in JCuda. Following @talonmies's idea, I implemented a JCuda version of his solution to link multiple PTX files and load them as a single module. Here is the relevant part of the code:

import static jcuda.driver.JCudaDriver.cuLinkAddFile;
import static jcuda.driver.JCudaDriver.cuLinkComplete;
import static jcuda.driver.JCudaDriver.cuLinkCreate;
import static jcuda.driver.JCudaDriver.cuLinkDestroy;
import static jcuda.driver.JCudaDriver.cuModuleGetFunction;
import static jcuda.driver.JCudaDriver.cuModuleLoadData;

import jcuda.Pointer;
import jcuda.driver.CUjitInputType;
import jcuda.driver.JITOptions;
import jcuda.driver.CUlinkState;
import jcuda.driver.CUmodule;
import jcuda.driver.CUfunction;

public class JCudaTestJIT{

    private CUmodule module;
    private CUfunction functionKernel;

    public void prepareModule(){
        String ptxFileName4 = "file4.ptx";
        String ptxFileName3 = "file3.ptx";
        String ptxFileName2 = "file2.ptx";
        String ptxFileName1 = "file1.ptx";

        CUlinkState linkState = new CUlinkState();
        JITOptions jitOptions = new JITOptions();
        cuLinkCreate(jitOptions, linkState);

        cuLinkAddFile(linkState, CUjitInputType.CU_JIT_INPUT_PTX, ptxFileName4, jitOptions);
        cuLinkAddFile(linkState, CUjitInputType.CU_JIT_INPUT_PTX, ptxFileName3, jitOptions);
        cuLinkAddFile(linkState, CUjitInputType.CU_JIT_INPUT_PTX, ptxFileName2, jitOptions);
        cuLinkAddFile(linkState, CUjitInputType.CU_JIT_INPUT_PTX, ptxFileName1, jitOptions);

        long sizeOut = 32768;
        byte[] image = new byte[32768];

        Pointer cubinOut = Pointer.to(image);

        cuLinkComplete(linkState, cubinOut, (new long[]{sizeOut}));

        module = new CUmodule();

        // Load the module from the image buffer
        cuModuleLoadData(module, cubinOut.getByteBuffer(0, 32768).array());

        cuLinkDestroy(linkState);

        functionKernel = new CUfunction();
        cuModuleGetFunction(functionKernel, module, "kernel");
    }

    // Other methods 
}

But I get CUDA_ERROR_INVALID_IMAGE when calling cuModuleLoadData. While debugging, I saw that after calling cuLinkComplete with the image array as the output, the array is still unchanged and empty. Am I passing the output parameter correctly? Is this how one passes a variable by reference in JCuda?

AmirSojoodi
  • Your input files have `.cu` extensions. If they are CUDA C source, they can't be JIT compiled or linked. Only PTX code or precompiled binary objects can be linked at runtime by the driver. – talonmies Sep 12 '15 at 08:01
  • @talonmies : Yeah they are PTX code. It was a typing mistake in the question. – AmirSojoodi Sep 12 '15 at 09:15

1 Answer

I had never written a single line of Java code until 30 minutes ago, let alone used JCUDA before, but an almost literal line-by-line translation of the native C++ code I gave you here seems to work perfectly:

import static jcuda.driver.JCudaDriver.*;
import java.io.*;
import jcuda.*;
import jcuda.driver.*;

public class JCudaRuntimeTest
{
    public static void main(String args[])
    {
        JCudaDriver.setExceptionsEnabled(true);

        cuInit(0);
        CUdevice device = new CUdevice();
        cuDeviceGet(device, 0);
        CUcontext context = new CUcontext();
        cuCtxCreate(context, 0, device);

        CUlinkState linkState = new CUlinkState();
        JITOptions jitOptions = new JITOptions();
        cuLinkCreate(jitOptions, linkState);

        String ptxFileName2 = "test_function.ptx";
        String ptxFileName1 = "test_kernel.ptx";

        cuLinkAddFile(linkState, CUjitInputType.CU_JIT_INPUT_PTX, ptxFileName2, jitOptions);
        cuLinkAddFile(linkState, CUjitInputType.CU_JIT_INPUT_PTX, ptxFileName1, jitOptions);

        long sz[] = new long[1];
        Pointer image = new Pointer();
        cuLinkComplete(linkState, image, sz);
        System.out.println("Pointer: " + image);
        System.out.println("CUBIN size: " + sz[0]);

        CUmodule module = new CUmodule();
        cuModuleLoadDataEx(module, image, 0, new int[0], Pointer.to(new int[0]));   
        cuLinkDestroy(linkState);

        CUfunction functionKernel = new CUfunction();
        String kernelname = "_Z6kernelPfS_S_S_";
        cuModuleGetFunction(functionKernel, module, kernelname);
        System.out.println("Function: " + functionKernel);
    }
}

which works like this:

> nvcc -ptx -arch=sm_21 test_function.cu
test_function.cu

> nvcc -ptx -arch=sm_21 test_kernel.cu
test_kernel.cu

> javac -cp ".;jcuda-0.7.0a.jar" JCudaRuntimeTest.java
> java -cp ".;jcuda-0.7.0a.jar" JCudaRuntimeTest
Pointer: Pointer[nativePointer=0xa5a13a8,byteOffset=0]
CUBIN size: 5924
Function: CUfunction[nativePointer=0xa588160]

The key here seems to be to use cuModuleLoadDataEx, noting that cuLinkComplete returns two things: a system pointer to the linked CUBIN, and the size of the image, written into a long[]. As in the C++ code, the pointer is passed directly to the module data load.
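On the "pass by reference" part of the question: Java passes object references by value, so an output value only comes back if the callee writes into an array (or object) that the caller keeps a handle to. The following is a minimal sketch in plain Java, without JCuda. The class `OutParamSketch`, the methods `fakeLinkComplete` and `reassignOnly`, and the value 1234 are all hypothetical and only mimic the shape of the `long[]` size argument of cuLinkComplete.

```java
public class OutParamSketch {

    // Hypothetical stand-in for a call that reports a size through an
    // out-parameter: it writes into the array the caller passed in.
    public static void fakeLinkComplete(long[] sizeOut) {
        sizeOut[0] = 1234L; // arbitrary placeholder value
    }

    // Reassigning the parameter only changes the callee's local copy of
    // the reference, so the caller's array is left untouched.
    public static void reassignOnly(long[] sizeOut) {
        sizeOut = new long[] { 1234L };
    }

    public static void main(String[] args) {
        // The caller keeps a handle to the array, so the write is visible.
        long[] kept = new long[1];
        fakeLinkComplete(kept);
        System.out.println("written in place: " + kept[0]);

        // The callee never writes into this array, so it stays at zero.
        long[] untouched = new long[1];
        reassignOnly(untouched);
        System.out.println("reassigned only:  " + untouched[0]);

        // Passing an anonymous array works inside the callee, but the
        // caller has no handle left to read the result from afterwards.
        fakeLinkComplete(new long[] { 32768L });
    }
}
```

This is why the working version above keeps `sz` in a variable instead of passing `new long[]{sizeOut}` inline. The image itself is a separate matter: in the native driver API, cuLinkComplete hands back a pointer to memory owned by the link state rather than filling a caller-supplied buffer.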

As a final comment, it would have been much simpler and easier if you had posted a proper repro case that could have been directly hacked on, rather than making me learn the rudiments of JCUDA and Java before I could create a useful repro case and get it to work. The documentation for JCUDA is basic, but complete, and against the working C++ example already provided, it only took a couple of minutes of reading to see how to do this.

talonmies
  • You left me nothing to say.. :) You are the best... I mean it.. "I had never written a single line of Java code until 30 minutes ago, let alone used JCUDA before.." This actually got me so excited..! I tested and ran it in the framework and everything is fine.. :) And you are right about the documentation in JCuda. I don't know why I didn't use this one, maybe because it said that "it is hardly possible to properly pass in the required option values for this method." and that made me give up on it. One more time, talonmies, thank you... :) – AmirSojoodi Sep 12 '15 at 11:41
  • Indeed, great work! (I'm the JCuda guy, and wasn't sure whether this was possible either (and couldn't test it ATM) :-o). Great to see that you could help here so quickly and got it working! – Marco13 Sep 12 '15 at 13:03