
I want to tag a number of objects using a CUDA kernel. The main purpose is to find the objects that were not tagged by any thread. I want to use competitive writes to achieve this: each thread writes TRUE to an array in which every location corresponds to an object, and several threads may write to the same location at the same time. The initial value of every array element is FALSE. If an element is still FALSE after the kernel has run, I know that the corresponding object was not tagged by any thread.

Is this a sound approach, or should I use something else, such as atomicAdd()? I do not need to know exactly how many threads wrote to each location.

talonmies
Wesley Ranger

1 Answer


OK, I have already found the answer via "Related Questions":

For a CUDA program, if multiple threads in a warp write to the same location then the location will be updated but it is undefined how many times the location is updated (i.e. how many actual writes occur in series) and it is undefined which thread will write last (i.e. which thread will win the race).

For devices of compute capability 2.x, if multiple threads in a warp write to the same address then only one thread will actually perform the write, which thread is undefined.

From the CUDA C Programming Guide section F.4.2:

If a non-atomic instruction executed by a warp writes to the same location in global memory for more than one of the threads of the warp, only one thread performs a write and which thread does it is undefined.

See also section 4.1 of the guide for more info.

In other words, if all threads writing to a given location write the same value, then it is safe.

Tom's answer

So I think competitive writes are a good way to achieve my goal: every thread writes the same value (TRUE), so it does not matter which thread wins the race.
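To make this concrete, here is a minimal sketch of the tagging pattern. The names `tag_objects`, `find_untagged`, and `targets` are made up for illustration, the flags are stored as `unsigned char` (0 = FALSE, 1 = TRUE), error checking is omitted for brevity, and `n_objects` is assumed to be positive. The untagged objects are read back only after the kernel has finished.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Each thread tags one object. Several threads may pick the same object;
// they all store the same value (1), so the race between them is harmless.
__global__ void tag_objects(const int *targets, int n_targets,
                            unsigned char *tagged)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n_targets) {
        tagged[targets[i]] = 1;  // plain, non-atomic store
    }
}

// Hypothetical host helper: targets[i] is the object index that thread i tags.
// Returns the indices of the objects that no thread tagged.
// Assumes n_objects > 0 and every target lies in [0, n_objects).
std::vector<int> find_untagged(const std::vector<int> &targets, int n_objects)
{
    int n_targets = static_cast<int>(targets.size());
    int *d_targets = nullptr;
    unsigned char *d_tagged = nullptr;

    cudaMalloc(&d_tagged, n_objects * sizeof(unsigned char));
    cudaMemset(d_tagged, 0, n_objects * sizeof(unsigned char));  // all FALSE

    if (n_targets > 0) {
        cudaMalloc(&d_targets, n_targets * sizeof(int));
        cudaMemcpy(d_targets, targets.data(), n_targets * sizeof(int),
                   cudaMemcpyHostToDevice);

        int threads = 256;
        int blocks = (n_targets + threads - 1) / threads;
        tag_objects<<<blocks, threads>>>(d_targets, n_targets, d_tagged);
    }

    // This blocking copy on the default stream runs after the kernel completes.
    std::vector<unsigned char> tagged(n_objects);
    cudaMemcpy(tagged.data(), d_tagged, n_objects * sizeof(unsigned char),
               cudaMemcpyDeviceToHost);

    cudaFree(d_targets);  // cudaFree(nullptr) is a no-op
    cudaFree(d_tagged);

    // Collect the objects whose flag is still FALSE.
    std::vector<int> untagged;
    for (int i = 0; i < n_objects; ++i) {
        if (!tagged[i]) untagged.push_back(i);
    }
    return untagged;
}
```

For example, calling `find_untagged({0, 2, 2, 2, 5}, 6)` has three threads racing on object 2, which still ends up tagged exactly like objects 0 and 5. An atomic operation would only be needed if the number of writers mattered, which is not the case here.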

Community
Wesley Ranger