The CUDA atomicMin operation seems to only find the minimum value of a chunk of device memory. But is there any way to find which block/thread actually found this minimum value? I have a compute capability 2.0 device.

Hailiang Zhang
1 Answer
If you are doing an atomicMin on a 32-bit value, you can instead use a generalized atomic operation on a 64-bit value, 32 bits of which represent the value being minimized and 32 bits of which represent the global index of the thread. A general approach is outlined here.
Since 64-bit atomicMin is only supported on cc 3.5 (and later) devices, I assume you are finding 32-bit minimum values.
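As a rough illustration of the packed 64-bit idea (a minimal sketch, not the code from the linked approach), the value can go in the upper 32 bits and the global thread index in the lower 32 bits, with the "generalized atomic" built from a 64-bit atomicCAS loop. The names pack, atomicMinPacked, minWithIndex and findMinWithIndex are hypothetical, the data is assumed to be unsigned and non-empty, and I am assuming 64-bit atomicCAS on global memory is available on your cc 2.0 device. Error checking is omitted for brevity.

```cuda
#include <cuda_runtime.h>
#include <climits>
#include <utility>
#include <vector>

// Value in the high 32 bits, index in the low 32 bits: comparing packed
// words compares values first, and ties resolve to the smaller index.
__host__ __device__ inline unsigned long long pack(unsigned int value,
                                                   unsigned int index) {
    return (static_cast<unsigned long long>(value) << 32) | index;
}

// Hypothetical custom atomic: 64-bit min emulated with an atomicCAS loop.
__device__ void atomicMinPacked(unsigned long long *addr,
                                unsigned long long candidate) {
    unsigned long long old = *addr;
    while (candidate < old) {
        unsigned long long assumed = old;
        old = atomicCAS(addr, assumed, candidate);
        if (old == assumed) break;  // our swap went through
    }
}

// Each thread offers its element together with its global index.
__global__ void minWithIndex(const unsigned int *data, unsigned int n,
                             unsigned long long *result) {
    unsigned int gid = blockIdx.x * blockDim.x + threadIdx.x;
    if (gid < n) atomicMinPacked(result, pack(data[gid], gid));
}

// Host wrapper (assumes a non-empty input); returns {min value, global index}.
std::pair<unsigned int, unsigned int>
findMinWithIndex(const std::vector<unsigned int> &h) {
    const unsigned int n = static_cast<unsigned int>(h.size());
    const unsigned int threads = 256;
    unsigned int *d_data = nullptr;
    unsigned long long *d_result = nullptr;
    unsigned long long packed = ULLONG_MAX;  // identity element for min

    cudaMalloc(&d_data, n * sizeof(unsigned int));
    cudaMalloc(&d_result, sizeof(unsigned long long));
    cudaMemcpy(d_data, h.data(), n * sizeof(unsigned int),
               cudaMemcpyHostToDevice);
    cudaMemcpy(d_result, &packed, sizeof(packed), cudaMemcpyHostToDevice);

    minWithIndex<<<(n + threads - 1) / threads, threads>>>(d_data, n, d_result);

    cudaMemcpy(&packed, d_result, sizeof(packed), cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    cudaFree(d_result);

    return std::make_pair(static_cast<unsigned int>(packed >> 32),
                          static_cast<unsigned int>(packed & 0xffffffffu));
}
```

From the returned global index you can recover the block as index / blockDim.x and the thread as index % blockDim.x for the launch configuration used. For signed 32-bit values the packing would need an order-preserving bias on the value half, which this sketch does not do.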
If you are working with 64-bit values, then you can use a parallel reduction technique to carry both the minimum (or maximum) value and the index through the reduction. This question/answer demonstrates a parallel reduction approach which finds both maximum and index, per row of a matrix.
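To make the reduction idea concrete, here is a minimal sketch (not the code from the linked question/answer, which works per matrix row and finds a maximum) that carries a 64-bit minimum and its index through a shared-memory block reduction over a flat array. The names blockMinIndex, reduceMinWithIndex and kThreads are hypothetical, the block size is assumed to be a power of two, and the final pass over the per-block results is done on the host for simplicity. Error checking is omitted.

```cuda
#include <cuda_runtime.h>
#include <climits>
#include <utility>
#include <vector>

constexpr int kThreads = 256;  // assumed power of two
constexpr int kBlocks = 64;    // arbitrary choice for this sketch

// True if (v, i) should replace (best, bestIdx): smaller value wins,
// equal values resolve to the smaller index.
__host__ __device__ inline bool better(long long v, unsigned int i,
                                       long long best, unsigned int bestIdx) {
    return v < best || (v == best && i < bestIdx);
}

__global__ void blockMinIndex(const long long *data, unsigned int n,
                              long long *blockMin, unsigned int *blockArg) {
    __shared__ long long sVal[kThreads];
    __shared__ unsigned int sIdx[kThreads];
    const unsigned int tid = threadIdx.x;

    // Grid-stride loop: each thread keeps its own running (min, index).
    long long best = LLONG_MAX;
    unsigned int bestIdx = UINT_MAX;
    for (unsigned int i = blockIdx.x * blockDim.x + tid; i < n;
         i += gridDim.x * blockDim.x) {
        if (better(data[i], i, best, bestIdx)) { best = data[i]; bestIdx = i; }
    }
    sVal[tid] = best;
    sIdx[tid] = bestIdx;
    __syncthreads();

    // Tree reduction in shared memory, moving value and index together.
    for (unsigned int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s && better(sVal[tid + s], sIdx[tid + s], sVal[tid], sIdx[tid])) {
            sVal[tid] = sVal[tid + s];
            sIdx[tid] = sIdx[tid + s];
        }
        __syncthreads();
    }
    if (tid == 0) {
        blockMin[blockIdx.x] = sVal[0];
        blockArg[blockIdx.x] = sIdx[0];
    }
}

// Host wrapper; returns {min value, index of that value in the input}.
std::pair<long long, unsigned int>
reduceMinWithIndex(const std::vector<long long> &h) {
    const unsigned int n = static_cast<unsigned int>(h.size());
    long long *d_data = nullptr, *d_min = nullptr;
    unsigned int *d_arg = nullptr;
    cudaMalloc(&d_data, n * sizeof(long long));
    cudaMalloc(&d_min, kBlocks * sizeof(long long));
    cudaMalloc(&d_arg, kBlocks * sizeof(unsigned int));
    cudaMemcpy(d_data, h.data(), n * sizeof(long long), cudaMemcpyHostToDevice);

    blockMinIndex<<<kBlocks, kThreads>>>(d_data, n, d_min, d_arg);

    std::vector<long long> mins(kBlocks);
    std::vector<unsigned int> args(kBlocks);
    cudaMemcpy(mins.data(), d_min, kBlocks * sizeof(long long),
               cudaMemcpyDeviceToHost);
    cudaMemcpy(args.data(), d_arg, kBlocks * sizeof(unsigned int),
               cudaMemcpyDeviceToHost);
    cudaFree(d_data);
    cudaFree(d_min);
    cudaFree(d_arg);

    // Final pass over the per-block partial results on the host.
    long long best = LLONG_MAX;
    unsigned int bestIdx = UINT_MAX;
    for (int b = 0; b < kBlocks; ++b) {
        if (better(mins[b], args[b], best, bestIdx)) {
            best = mins[b];
            bestIdx = args[b];
        }
    }
    return std::make_pair(best, bestIdx);
}
```

Because the index travels with the value at every step, no atomics are needed here, so this form does not depend on 64-bit atomicMin support.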

Robert Crovella