I am trying out texture memory in CUDA. I wrote a simple program that adds numbers using a 2D texture. A for loop launches the kernel several times. Oddly, it looks like the texture cache is being flushed only on every second kernel launch or so.
The kernel is simply
__global__ void add(float *f, float *fn){
    int y = threadIdx.x;
    int x = blockIdx.x;
    // Read through the 2D texture
    float a = tex2D(text,x,y);
    if (x==1 && y==0){
        printf("The location is : %d %d %d\n", x,y,x+nx*y);
        printf("The first element read through texture is : %f\n", a );
        printf("The first element read through global is : %f\n", f[x+nx*y]);
        printf("Printing to :%p\n", f);
    }
    // Double the value and write it to the output array
    a+=a;
    fn[x+nx*y] = a;
}
The texture variable is defined globally as
texture<float,2> text;
Once the variables are defined and the data is copied to a 1D device array with cudaMemcpy, the texture is bound using
cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
gpuErrchk(cudaBindTexture2D(NULL, text, f, desc, nx, ny, sizeof(float)*nx));
checkerr();
text.addressMode[0] = cudaAddressModeWrap;
text.addressMode[1] = cudaAddressModeWrap;
The main loop is given below
for (int t=0; t<10; t++){
    cout<<"Iteration : "<<t<<endl;
    add<<<nx,ny>>>(f,fn);
    cudaDeviceSynchronize();
    checkerr();
    cudaMemcpy(h_f,fn,sizeof(float)*nx*ny,cudaMemcpyDeviceToHost);
    //checkerr();
    cout<<"In iteration "<<t<<" the texture is pointing to "<<f<<endl;
    swap(f,fn);
    for (int i=0; i<nx*ny; i++)
        cout<<h_f[i]<<' ';
    cout<<endl;
}
Here the kernel reads from f and saves the output to fn. The swap function then simply swaps the two pointers after the kernel is complete. The output I am getting is
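For reference, the behaviour I expect from this read, write and swap pattern can be sketched on the host alone. This is only a rough illustration with no CUDA and no texture involved: `add_host` and `run_ping_pong` are hypothetical names I made up for this sketch, and it assumes every read goes through the current `f` pointer, which is what the global-memory read in the kernel does.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical host-side stand-in for the add kernel:
// reads each element of src, doubles it, and writes it to dst.
void add_host(const float* src, float* dst, int n) {
    for (int i = 0; i < n; ++i) {
        float a = src[i];
        a += a;
        dst[i] = a;
    }
}

// Hypothetical helper: runs `iterations` read/write/swap steps and
// returns the contents of the buffer that was written last.
std::vector<float> run_ping_pong(std::vector<float> initial, int iterations) {
    assert(iterations >= 0);
    const int n = static_cast<int>(initial.size());
    std::vector<float> a = std::move(initial);
    std::vector<float> b(n, 0.0f);
    float* f = a.data();   // buffer read in the current iteration
    float* fn = b.data();  // buffer written in the current iteration
    for (int t = 0; t < iterations; ++t) {
        add_host(f, fn, n);
        std::swap(f, fn);  // the freshly written buffer becomes the next input
    }
    // After the swap, f points at the most recently written buffer.
    return std::vector<float>(f, f + n);
}
```

With an initial value of 1 in every element, this doubles the data on every iteration (2, then 4, then 8), which is the sequence I expected the texture reads to follow as well.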
Iteration : 0
The location is : 1 0 1
The first element read through texture is : 1.000000
The first element read through global is : 1.000000
Printing to :0x500a20000
In iteration 0 the texture is pointing to 0x500a20000
Iteration : 1
The location is : 1 0 1
The first element read through texture is : 1.000000
The first element read through global is : 2.000000
Printing to :0x500a20200
In iteration 1 the texture is pointing to 0x500a20200
Iteration : 2
The location is : 1 0 1
The first element read through texture is : 2.000000
The first element read through global is : 2.000000
Printing to :0x500a20000
In iteration 2 the texture is pointing to 0x500a20000
Iteration : 3
The location is : 1 0 1
The first element read through texture is : 2.000000
The first element read through global is : 4.000000
Printing to :0x500a20200
In iteration 3 the texture is pointing to 0x500a20200
The pointers are swapping without any issues, and the access through global memory always gives the correct answer. But the read through the texture returns the old value on every second iteration, which makes it look to me like the texture cache is not being flushed.
Has anyone encountered this issue? I am pretty sure it's something I am doing wrong. Any help would be appreciated.