On most systems, the host and the device do not share physical memory, the CPU might use RAM and the GPU might use its own global memory. SYCL needs to know which data it will be sharing between the host and the devices.
For this purpose, SYCL uses its buffers, the buffer class is generic over the element type and the number of dimensions. When passed a raw pointer, the buffer(T* ptr, range size) constructor takes ownership of the memory it has been passed. This means that we absolutely cannot use that memory ourselves while the buffer exists, which is why we begin a C++ scope. At the end of their scope, the buffers will be destroyed and the memory returned to the user. A size argument is a range object, which has to have the same number of dimensions as the buffer and is initialized with the number of elements in each dimension. Here, we have one dimension with one element.
Buffers are not associated with a particular queue or context, so they are capable of handling data transparently between multiple devices.
Accessors are used to access request control over the device memory from the buffer objects. Their modes will take care of data movement between host and device. So we need not have to explicitly copy back the result from device to host.
Below is the example for more clarification:
#include <bits/stdc++.h>
#include <CL/sycl.hpp>
using namespace std;
class vector_addition;
int main(int, char**) {
//creating host memory
int *a=(int *)malloc(10*sizeof(int));
int *b=(int *)malloc(10*sizeof(int));
int *c=(int *)malloc(10*sizeof(int));
for(int i=0;i<10;i++){
a[i]=i;
b[i]=10-i;
}
cl::sycl::default_selector device_selector;
cl::sycl::queue queue(device_selector);
std::cout << "Running on "<< queue.get_device().get_info<cl::sycl::info::device::name>()<< "\n";
{
//creating buffer from pointer of host memory
cl::sycl::buffer<int, 1> a_sycl{a, cl::sycl::range<1>{10} };
cl::sycl::buffer<int, 1> b_sycl{b, cl::sycl::range<1>{10} };
cl::sycl::buffer<int, 1> c_sycl{c, cl::sycl::range<1>{10} };
queue.submit([&] (cl::sycl::handler& cgh) {
//creating accessor of buffer with proper mode
auto a_acc = a_sycl.get_access<cl::sycl::access::mode::read>(cgh);
auto b_acc = b_sycl.get_access<cl::sycl::access::mode::read>(cgh);
auto c_acc = c_sycl.get_access<cl::sycl::access::mode::write>(cgh);//responsible for copying back to host memory
//kernel for execution
cgh.parallel_for<class vector_addition>(cl::sycl::range<1>{ 10 }, [=](cl::sycl::id<1> idx) {
c_acc[idx] = a_acc[idx] + b_acc[idx];
});
});
}
for(int i=0;i<10;i++){
cout<<c[i]<<" ";
}
cout<<"\n";
return 0;
}