3

I have a 3d array containing the non-cubic bounding box of a sparse geometry.

The array geometry[x][y][z] contains the value 0 if (x,y,z) is part of the computational domain and otherwise 1.

In an attempt to reorder computations I would like to traverse this space using a Hilbert Curve.

The context is optimizing global memory access in a memory-bound GPU program.

How can I implement this?

Update: I just want to traverse the non-empty cells, as I will only store those (in an array) together with an adjacency list which keeps track of an element's 19 neighboring nodes.

The computation is simply copying between two arrays:

dst[i] = src[adjacency_map[i]]

This is the propagation phase of a sparse Lattice Boltzmann method, where the physical interpretation is streaming 'fluid particles' from a neighboring site.

The more sequential the values in adjacency_map are; the more coalesced memory accesses we hopefully get.

OpenCL kernel:

__kernel void propagation(__global double *dst, __global double *source,
                          __global const int *adjacency_map, const uint max_size)
{
    size_t l = get_global_id(0);

    if( l > max_size ) 
        return;

    dst[l] = src[adjacency_map[l]];
}
kyrre
  • 626
  • 2
  • 9
  • 24
  • do you want to traverse **all** the cells of the volume, or just the non-empty cells? What computation do you wish to apply to the cells? – Jared Hoberock Aug 01 '11 at 23:27
  • i want to traverse only the non-empty cells as they will be later stored in an array. I've updated the original question with some more information. – kyrre Aug 02 '11 at 09:01

2 Answers2

3

A Hilbert curve would be a tall order. It seems difficult to find a formulation that would allow random access to the indices of the points on the curve.

A Morton ordering, however, would be reasonable, and has some of the same nice properties as it is also a space-filling curve. There is also a random access procedure for finding the Morton number of an N-dimensional point.

What you might consider is a two step process:

  1. Apply a step of stream compaction to your data to select the volume elements that you wish to process

  2. Sort this compacted data, using their Morton indices as the sorting key.

You could use thrust for both the stream compaction and the key-value sort.

This should produce a list of volume elements in a ordering which promotes contiguity. That said, the overhead of reorganizing the data may dominate the cost of the original irregular access pattern.

Jared Hoberock
  • 11,118
  • 3
  • 40
  • 76
1

That sounds quite impossible.

Did you already rule out a kdtree or octree?

The descriptions of kdtree (chapter 21.2) and octree (chapter 21.8) in numerical recipes are quite understandable: http://apps.nrbook.com/rollover/index.html

whoplisp
  • 2,508
  • 16
  • 19