I'm diving into multi-threaded programming and thinking about lock-free reference counting using atomic operations.
It's obvious that an atomic operation can be slower than a non-atomic one, at least by a constant factor. My concern is the additional synchronization the CPU has to perform to carry out an atomic operation.
I wonder whether, and by how much, executing an atomic operation on core A affects the performance of other cores that:
- have nothing to do with core A
- are executing different threads of the same process as core A
- are executing an atomic operation
- are executing an atomic operation and are running different threads of the same process as core A
- are executing any memory-related operation (load, store, ...)
- are executing any memory-related operation in the same memory region (cache line, page?) as core A
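
To make the question concrete, here is a minimal sketch of the kind of counter I have in mind, assuming C++11 `std::atomic`. The class name `RefCounted` and its methods are hypothetical and only for illustration; they are not taken from any library.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical intrusive reference count; all names are illustrative.
class RefCounted {
public:
    // Take an additional reference. The increment must be atomic,
    // but it does not need to order any other memory accesses.
    void acquire() noexcept {
        count_.fetch_add(1, std::memory_order_relaxed);
    }

    // Drop a reference. Returns true if the caller released the last
    // one and is therefore responsible for destroying the object.
    bool release() noexcept {
        // acq_rel so that writes made through other references are
        // visible to whichever thread ends up destroying the object.
        const std::size_t previous =
            count_.fetch_sub(1, std::memory_order_acq_rel);
        assert(previous > 0 && "release() called more often than acquire()");
        return previous == 1;
    }

    // Snapshot of the current count; may be stale as soon as it returns.
    std::size_t use_count() const noexcept {
        return count_.load(std::memory_order_relaxed);
    }

private:
    std::atomic<std::size_t> count_{1};  // the creator holds the first reference
};
```

Every `acquire()` and `release()` here is an atomic read-modify-write on the cache line that holds `count_`, and that is the operation whose effect on other cores I'm asking about.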