I have a 3d array containing the non-cubic bounding box of a sparse geometry.
The array geometry[x][y][z] contains the value 0 if (x,y,z) is part of the computational domain and otherwise 1.
In an attempt to reorder computations I would like to traverse this space using a Hilbert Curve.
The context is optimizing global memory access in a memory-bound GPU program.
How can I implement this?
Update: I just want to traverse the non-empty cells, as I will only store those (in an array) together with an adjacency list which keeps track of an element's 19 neighboring nodes.
The computation is simply copying between two arrays:
dst[i] = src[adjacency_map[i]]
This is the propagation phase of a sparse Lattice Boltzmann method, where the physical interpretation is streaming 'fluid particles' from a neighboring site.
The more sequential the values in adjacency_map are; the more coalesced memory accesses we hopefully get.
OpenCL kernel:
__kernel void propagation(__global double *dst, __global double *source,
__global const int *adjacency_map, const uint max_size)
{
size_t l = get_global_id(0);
if( l > max_size )
return;
dst[l] = src[adjacency_map[l]];
}