0

Problem

Suppose I have a big array in global memory filled with mostly 0s, but with some elements about (25%) being numbers:

[9, 0, 0, 0, 7, 0, 0, 3, 0, 0, 0, 0, 5, 0, 0, 0, 8, 0, 2, 0, 0, 4, 0, 0, 0, 0, 5]

Is there any efficient kernel that compresses that array by moving numbers > 0 to the left side?

[9, 7, 3, 5, 8, 2, 4, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

What I've tried

I've thought in many potential solutions, usually taking inspiration from binary reduce algorithms, but those don't really apply, because I'm not compressing all numbers into one, but many. A semi-full initial array, for example, wouldn't be changed considerably. A way to do it, of course, would be to move the array to the CPU and apply the obvious sequential algorithm:

var j = 0;
var x = 0;
for (int i = 0; i < len; ++i) {
  if (arr[i] !== 0) {
    x      = arr[j];
    arr[j] = arr[i];
    arr[j] = x;
  }
  if (arr[j] > 0) {
    j += 1;
  }
}

But the overhead may be significant for big enough arrays. I couldn't find any solution that takes less than O(N) kernel invocations.

MaiaVictor
  • 51,090
  • 44
  • 144
  • 286
  • 3
    google "thrust stream compaction" For basic usage, that is what I would recommend. If you want to write a CUDA or opencl kernel to do it, you can. The basic method is to perform a [parallel prefix sum](https://developer.nvidia.com/gpugems/GPUGems3/gpugems3_ch39.html) on your data (treating each non-zero value as just "1"), then use the prefix sum to identify the correct location in the "output" array to copy each non-zero value to. – Robert Crovella May 18 '18 at 17:34
  • @RobertCrovella that looks like a complete answer actually. Thanks. – MaiaVictor May 18 '18 at 17:46
  • @RobertCrovella just wondering, I was using OpenCL through Rust. I guess thrust will not be available to me, right? Writing a Kernel seems to be the only option. I understand the idea of the algorithm, but the optimizations (blocks, threads, etc.) look overwhelming... – MaiaVictor May 18 '18 at 18:03
  • [boost::compute](https://github.com/boostorg/compute) has some similarities to thrust. Try [this](https://stackoverflow.com/questions/39402091/boostcompute-stream-compaction). – Robert Crovella May 18 '18 at 18:09
  • @RobertCrovella great! Feels like I'm getting closer, but then I guess there is no way to use this on Rust, right? Is "ArrayFire" by chances similar to boost::compute and Thrust? – MaiaVictor May 18 '18 at 18:15

0 Answers0