0

I have a C++ class that provides bindings to OpenCL. OpenCL functions are exposed through the pointers to callable objects of type cl::make_kernel. To use respective object, I have to create this pointer in my class.

Due to the OpenCL design I can not create the cl::make_kernel object until I initialize OpenCL engine by function let's call it openCLCompile that provides me cl::Program object required to construct cl::make_kernel.

Let's say that I would like to expose two OpenCL function that is each in separate file. The function openCLCompile accepts a vector of all files and can be called only once. Normally my class would look as follows:

    class OpenCLWrapper
    {
    public:
         initializeOpenCL()
         {
              std::vector<std::string> CLfiles;
              CLFiles.push_back("functionA.cl");
              CLFiles.push_back("functionB.cl");
              cl::Program program = openCLCompile(CLfiles);
              CLfunctionA = std::make_shared<cl::make_kernel<cl::Buffer&>>(cl::Kernel(program, "FUNCTION1"));
              CLfunctionB = std::make_shared<cl::make_kernel<cl::Buffer&,  cl_double16&>>(cl::Kernel(program, "FUNCTION2"));
         }
    private:
         std::shared_ptr<cl::make_kernel<cl::Buffer&>> CLfunctionA;
         std::shared_ptr<cl::make_kernel<cl::Buffer&, cl_double16&>> CLfunctionB;
    }

However by using this logic, I must always compile both "functionA.cl" and "functionB.cl" files and create both pointers CLfunctionA and CLfunctionB.

I would like to have dedicated functions to register openCL function that is separate for both functions. In this way it is possible to leave second pointer nullptr. However when I call this register function before I call initializeOpenCL, I don't have cl::Program object to create respective cl::make_kernel objects. It would be nice however to specify how to initialize the respective object in scope of register function by means of code that will change member variable. So I decided to solve with the following logic

    class OpenCLWrapper
    {
    public:
         registerFunctionA()
         {
               CLFiles.push_back("functionA.cl");
               std::function<void(cl::Program)> f = [this](cl::Program program) {
                    CLfunctionA = std::make_shared<cl::make_kernel<cl::Buffer&>>(cl::Kernel(program, "FUNCTION1"));
               };
               callbacks.push_back(f);
         }

         registerFunctionB()
         {
              CLFiles.push_back("functionB.cl");
              std::function<void(cl::Program)> f = [this](cl::Program program) 
              {
                   CLfunctionB = std::make_shared<cl::make_kernel<cl::Buffer&,  cl_double16&>>(cl::Kernel(program, "FUNCTION2"));
              };
              callbacks.push_back(f);
         }

         initializeOpenCL()
         {
               cl::Program p = openCLCompile(CLfiles);
               for(std::function<void(cl::Program)> f : callbacks)
               {
                    f(p);
               }
         }
    private:
         std::shared_ptr<cl::make_kernel<cl::Buffer&>> CLfunctionA;
         std::shared_ptr<cl::make_kernel<cl::Buffer&, cl_double16&>> CLfunctionB;
         std::vector<std::string> CLfiles;
         std::vector<std::function<void(cl::Program)>> callbacks;
    }

It seems working but I want to know if this is OK implementation not leaking memory or I should have implemented these callbacks in other way.

VojtaK
  • 483
  • 4
  • 13

1 Answers1

0

Overall it looks good to me, I don't believe there any memory leaks. I would recommend this small change that serves mostly to save you some typing.

callbacks.push_back([this](cl::Program program){ ... });

instead of

std::function<void(cl::Program)> f = [this](cl::Program program) { ... };
callbacks.push_back(f);

Side note:

It is generally preferable to use std::vector::emplace_back(...) rather than std::vector::push_back(...) for reasons listed here

And the auto keyword may be helpful for you as it could make

for(std::function<void(cl::Program)> f : callbacks)

into

for(auto f : callbacks)

which, at least in my opinion, is easier to read, easier to type, and requires less thought about exactly what the name of the type you're dealing with is