0

I am trying my hand at using texture memory in CUDA. I wrote a simple code to add numbers using a 2D texture. There is a for loop that repeats the kernal multiple times. But weirdly enough, it looks like the texture cache is being flushed only every two kernal launch or so.

The kernal is simply

_global__ void add(float *f, float *fn){
int y = threadIdx.x;
int x = blockIdx.x;

float a = tex2D(text,x,y);
if (x==1 && y==0){
    printf("The location is : %d %d %d\n", x,y,x+nx*y);
    printf("The first element read through texture is : %f\n", a );
    printf("The first element read through global is : %f\n", f[x+nx*y]);
    printf("Printing to :%p\n", f);
}
a+=a;

fn[x+nx*y] = a; }

The texture variable is defined globally as

texture<float,2> text;

Once the variables are defined and copied to device 1D array cudamemcpy, the texture is bound using

cudaChannelFormatDesc desc = cudaCreateChannelDesc<float>();
gpuErrchk(cudaBindTexture2D(NULL, text, f, desc, nx, ny, sizeof(float)*nx));
checkerr();
text.addressMode[0] = cudaAddressModeWrap;
text.addressMode[1] = cudaAddressModeWrap;

The main loop is as given below

or (int t=0; t<10; t++){
    cout<<"Iteration : "<<t<<endl;
        add<<<nx,ny>>>(f,fn);
    cudaDeviceSynchronize();
    checkerr();

    cudaMemcpy(h_f,fn,sizeof(float)*nx*ny,cudaMemcpyDeviceToHost);
//checkerr();
    cout<<"In iteration "<<t<<" the texture is pointing to "<<f<<endl;
    swap(f,fn);
for (int i=0; i<nx*ny; i++)
    cout<<h_f[i]<<' ';
cout<<endl;

}

Here the kernal reads from f, and saves the output to fn. Then the swap function simply swaps the pointer after the kernal is complete. The output I am getting is

Iteration : 0
The location is : 1 0 1
The first element read through texture is : 1.000000
The first element read through global is : 1.000000
Printing to :0x500a20000
In iteration 0 the texture is pointing to 0x500a20000
Iteration : 1
The location is : 1 0 1
The first element read through texture is : 1.000000
The first element read through global is : 2.000000
Printing to :0x500a20200
In iteration 1 the texture is pointing to 0x500a20200
Iteration : 2
The location is : 1 0 1
The first element read through texture is : 2.000000
The first element read through global is : 2.000000
Printing to :0x500a20000
In iteration 2 the texture is pointing to 0x500a20000
Iteration : 3
The location is : 1 0 1
The first element read through texture is : 2.000000
The first element read through global is : 4.000000
Printing to :0x500a20200
In iteration 3 the texture is pointing to 0x500a20200

The pointers are swapping without any issues and the access through global memory always gives the correct answer. But it looks to me like the texture cache is not flushed, leading to it fetching the old values.

Has anyone encountered this issue? I am pretty sure its something I am doing wrong. Any help would be very helpful

  • 1
    since the texture is bound to `f` by value and not by reference, your pointer swapping is having no effect on what the texture is bound to. The first time your kernel is run, the update to `fn` has no effect on the texture. When you then run the kernel the 2nd time, although you have swapped pointers, the texture is still pointing to the previous `f` array which was not modified by the previous kernel call. Therefore in the 2nd kernel iteration, the texture value appears to be the same, because it is. You have a logical code design flaw, if you expecting texture changes at each iteration – Robert Crovella Nov 10 '20 at 14:13
  • Ah that makes a lot of sense. Thanks for the comment. For my application, I need to write the output back to the texture each iteration. Is there a way to bind the texture by reference? Otherwise, the only option I see is to copy the memory itself. – Githin Tom Nov 10 '20 at 15:08
  • I was able to fix this issue by using two textures and toggling between each for read and write with each iteration. This seems to fix the issue. Thanks for the help – Githin Tom Nov 10 '20 at 16:32
  • You might want to investigate CUDA [surfaces](https://stackoverflow.com/questions/59899751/memset-cuarray-for-surface-memory/59904496#59904496). They are designed to be written to. Writing to an underlying texture resource from a CUDA kernel that is using that CUDA texture is considered to result in undefined behavior. Also, you may want to switch to texture/surface **objects** rather than the **reference** method you are using now. See [here](https://developer.nvidia.com/blog/cuda-pro-tip-kepler-texture-objects-improve-performance-and-flexibility/). – Robert Crovella Nov 10 '20 at 16:37
  • Surfaces do seem to be exactly what I'm looking for. I'll try to implement that. Thanks a lot for the help and info. – Githin Tom Nov 10 '20 at 23:38

1 Answers1

1

The problem here is that the texture, once bound, does not alternate between the two pointers, when you are doing your pointer swapping.

Since the texture is bound to f by value and not by reference, your pointer swapping is having no effect on what the texture is bound to. The first time your kernel is run, the update to fn has no effect on the texture. When you then run the kernel the 2nd time, although you have swapped pointers, the texture is still pointing to the previous f array which was not modified by the previous kernel call. Therefore in the 2nd kernel iteration, the texture value appears to be the same, because it is. You have a logical code design flaw, if you expecting texture changes at each iteration.

One possible "fix" could be to rebind the texture at each step.

However, I wouldn't suggest this kind of "fix". Writing to the bound linear memory associated with a texture from a kernel can give rise to undefined behavior. A texture is just a memory region with an interposing cache. The texture memory is expected to be read-only from the kernel code, so that the cache is always coherent. If you write to the underlying memory, the cache may be no longer coherent.

For situations where you wish to "write to a texture" from kernel code, the recommended approach is to use surfaces. There are CUDA sample codes that demonstrate this.

Also, the general recommendation is to switch from using bound texture (or surface) references to bindless texture (or surface) objects, as discussed in this blog.

Robert Crovella
  • 143,785
  • 11
  • 213
  • 257