
CUDA's atomicMin operation seems to find only the minimum value in a chunk of device memory. Is there any way to find out which block/thread produced that minimum value? I have a compute capability 2.0 device.

Hailiang Zhang

1 Answer


If you are doing an atomicMin on a 32-bit value, you can instead use a generalized atomic operation on a 64-bit value, in which 32 bits hold the value being minimized and the other 32 bits hold the global index of the thread. Placing the value in the upper 32 bits makes the 64-bit comparison order by value first. A general approach is outlined here.

Since 64-bit atomicMin is only supported on devices of cc 3.5 and higher, I assume you are finding 32-bit minimum values.
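As a rough illustration of the packed-value idea, here is a minimal sketch that builds the 64-bit minimum out of a 64-bit atomicCAS loop rather than a native 64-bit atomicMin. It assumes unsigned 32-bit input values (signed or float values would first need an order-preserving bit transform) and a non-empty array. The names `pack`, `atomicMinPacked`, `minWithIndex` and `findMinWithIndex` are made up for this example and are not taken from the linked approach.

```cuda
#include <cuda_runtime.h>
#include <climits>

// Value in the high 32 bits, global thread index in the low 32 bits, so that
// comparing the packed words orders by value first and by index on ties.
__host__ __device__ inline unsigned long long pack(unsigned int value,
                                                   unsigned int index) {
    return (static_cast<unsigned long long>(value) << 32) | index;
}

// Hypothetical helper: 64-bit atomic min emulated with a 64-bit atomicCAS loop.
__device__ void atomicMinPacked(unsigned long long *addr,
                                unsigned long long candidate) {
    unsigned long long old = *addr;
    while (candidate < old) {
        unsigned long long assumed = old;
        old = atomicCAS(addr, assumed, candidate);
        if (old == assumed) break;  // our candidate was stored
    }
}

__global__ void minWithIndex(const unsigned int *data, unsigned int n,
                             unsigned long long *result) {
    unsigned int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) atomicMinPacked(result, pack(data[idx], idx));
}

// Host wrapper (assumes n > 0). Returns false if any CUDA call fails.
bool findMinWithIndex(const unsigned int *hData, unsigned int n,
                      unsigned int *minValue, unsigned int *minIndex) {
    unsigned int *dData = nullptr;
    unsigned long long *dResult = nullptr;
    unsigned long long hResult = ULLONG_MAX;  // larger than any real packed entry
    const size_t bytes = n * sizeof(unsigned int);
    bool ok = cudaMalloc(&dData, bytes) == cudaSuccess
           && cudaMalloc(&dResult, sizeof(hResult)) == cudaSuccess
           && cudaMemcpy(dData, hData, bytes, cudaMemcpyHostToDevice) == cudaSuccess
           && cudaMemcpy(dResult, &hResult, sizeof(hResult),
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        const unsigned int threads = 256;
        minWithIndex<<<(n + threads - 1) / threads, threads>>>(dData, n, dResult);
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(&hResult, dResult, sizeof(hResult),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dData);
    cudaFree(dResult);
    if (ok) {
        *minValue = static_cast<unsigned int>(hResult >> 32);
        *minIndex = static_cast<unsigned int>(hResult & 0xFFFFFFFFull);
    }
    return ok;
}
```

With a 1D launch like this one, the block and thread that supplied the minimum can be recovered from the returned index as `minIndex / threads` and `minIndex % threads`.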

If you are working with 64-bit values, then you can use a parallel reduction technique that carries both the minimum (or maximum) value and its index through the reduction. This question/answer demonstrates a parallel reduction approach that finds both the maximum and its index for each row of a matrix.
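The reduction idea can be sketched as below. This is not the code from the linked question/answer, just a minimal shared-memory version under stated assumptions: signed 64-bit inputs, a fixed power-of-two block size `BLOCK`, a non-empty array, and a final pass over the per-block results on the host. All names (`BLOCK`, `blockMinWithIndex`, `reduceMinWithIndex`) are hypothetical.

```cuda
#include <cuda_runtime.h>
#include <climits>
#include <vector>

#define BLOCK 256  // assumed block size; must be a power of two

// Each block reduces its slice to one (value, index) pair in shared memory.
__global__ void blockMinWithIndex(const long long *data, unsigned int n,
                                  long long *outVal, unsigned int *outIdx) {
    __shared__ long long sVal[BLOCK];
    __shared__ unsigned int sIdx[BLOCK];
    unsigned int tid = threadIdx.x;
    unsigned int gid = blockIdx.x * blockDim.x + tid;
    sVal[tid] = (gid < n) ? data[gid] : LLONG_MAX;  // pad out-of-range slots
    sIdx[tid] = gid;
    __syncthreads();
    for (unsigned int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s) {
            long long v = sVal[tid + s];
            unsigned int i = sIdx[tid + s];
            // Keep the smaller value; on ties keep the smaller index.
            if (v < sVal[tid] || (v == sVal[tid] && i < sIdx[tid])) {
                sVal[tid] = v;
                sIdx[tid] = i;
            }
        }
        __syncthreads();
    }
    if (tid == 0) {
        outVal[blockIdx.x] = sVal[0];
        outIdx[blockIdx.x] = sIdx[0];
    }
}

// Host wrapper (assumes n > 0). Returns false if any CUDA call fails.
bool reduceMinWithIndex(const long long *hData, unsigned int n,
                        long long *minValue, unsigned int *minIndex) {
    const unsigned int blocks = (n + BLOCK - 1) / BLOCK;
    long long *dData = nullptr, *dVal = nullptr;
    unsigned int *dIdx = nullptr;
    std::vector<long long> hVal(blocks);
    std::vector<unsigned int> hIdx(blocks);
    bool ok = cudaMalloc(&dData, n * sizeof(long long)) == cudaSuccess
           && cudaMalloc(&dVal, blocks * sizeof(long long)) == cudaSuccess
           && cudaMalloc(&dIdx, blocks * sizeof(unsigned int)) == cudaSuccess
           && cudaMemcpy(dData, hData, n * sizeof(long long),
                         cudaMemcpyHostToDevice) == cudaSuccess;
    if (ok) {
        blockMinWithIndex<<<blocks, BLOCK>>>(dData, n, dVal, dIdx);
        ok = cudaGetLastError() == cudaSuccess
          && cudaMemcpy(hVal.data(), dVal, blocks * sizeof(long long),
                        cudaMemcpyDeviceToHost) == cudaSuccess
          && cudaMemcpy(hIdx.data(), dIdx, blocks * sizeof(unsigned int),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    cudaFree(dData);
    cudaFree(dVal);
    cudaFree(dIdx);
    if (!ok) return false;
    // Final pass over the per-block results; blocks are visited in index
    // order, so a strict "<" keeps the lowest index on ties.
    *minValue = hVal[0];
    *minIndex = hIdx[0];
    for (unsigned int b = 1; b < blocks; ++b) {
        if (hVal[b] < *minValue) {
            *minValue = hVal[b];
            *minIndex = hIdx[b];
        }
    }
    return true;
}
```

Finishing on the host keeps the sketch short; for large inputs a second kernel launch over the per-block results would be the more usual choice.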

Community
Robert Crovella