We have multiple threads that each calculate a result and then need to write it into one shared array, one thread at a time. We've tried using atomicCAS as a spin lock, as in the example below. In some parts of the code it works; in other parts it hangs because of warp divergence. The order in which the threads write their results doesn't matter, but they must never write to the array at the same time.
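// Spin until this thread acquires the lock
while (atomicCAS(&arrayAccess, AVAILABLE, NOT_AVAILABLE) == NOT_AVAILABLE);
arrayGlobalMemory[count] = result;    // critical section: append the result
count++;
atomicExch(&arrayAccess, AVAILABLE);  // release the lock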
The answer below says that this isn't possible. This seems like pretty basic functionality: surely parallel access to an array can be serialized? Can someone suggest how to modify the code to get serialized array access from parallel threads, or show some sample code that works correctly?
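For reference, one lock-free direction that may fit the stated requirement (any order, but no two threads writing the same element) is to let each thread reserve its own slot with atomicAdd instead of taking a lock. The following is only a minimal sketch under assumptions: the names appendResults and runAppend, the int element type, and the stand-in calculation are all hypothetical, and error checking is omitted.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: every thread appends one value to `array`.
__global__ void appendResults(int *array, int *count, int capacity)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    int result = 2 * tid;  // stand-in for the real per-thread calculation

    // atomicAdd returns the value *count held before the increment,
    // so each thread receives a distinct index and no lock is needed.
    int slot = atomicAdd(count, 1);

    // The counter still advances when the array is full; only the write is skipped.
    if (slot < capacity)
        array[slot] = result;
}

// Hypothetical host helper: launches the kernel, copies the array back,
// and returns the final counter value. Error checking omitted for brevity.
int runAppend(int blocks, int threadsPerBlock, int *hostOut, int capacity)
{
    int *dArray = nullptr;
    int *dCount = nullptr;
    cudaMalloc(&dArray, capacity * sizeof(int));
    cudaMalloc(&dCount, sizeof(int));
    cudaMemset(dCount, 0, sizeof(int));

    appendResults<<<blocks, threadsPerBlock>>>(dArray, dCount, capacity);

    int count = 0;
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(&count, dCount, sizeof(int), cudaMemcpyDeviceToHost);
    cudaMemcpy(hostOut, dArray, capacity * sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(dArray);
    cudaFree(dCount);
    return count;
}

int main()
{
    const int blocks = 4, threadsPerBlock = 64, capacity = 256;
    static int out[capacity];
    int appended = runAppend(blocks, threadsPerBlock, out, capacity);
    printf("threads appended: %d\n", appended);
    return 0;
}
```

The writes here are not serialized in time; they are simply guaranteed to land in different elements, which may be enough when the order of the results doesn't matter. Because no thread ever waits on another, there is no spin loop for warp divergence to stall.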