@Hot_Licks: Actually, if the two threads are hyperthreads running on the same core, there is no problem with having both access the same lines, for either reads or writes. Clean lines are shared cost-free between hardware threads on the same Intel core. Even dirty lines are shared very cheaply, although you can get MO nukes (memory-ordering machine clears) if one thread is reading the data at the same time the other is writing it. (Oddly enough, there is no penalty if two such hyperthreads are writing at the same time.)
With AMD's only "threaded" CPU, Bulldozer, I think that write sharing is even less costly.
But this applies only to hardware threads (e.g. Intel hyperthreads, a.k.a. logical processors) running on the same physical core. If they are running on different physical cores, there is no win. Since most software threading packages migrate threads arbitrarily, your rule is not so bad.
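If you do want two threads to land on sibling hyperthreads rather than be migrated arbitrarily, here is a rough sketch, assuming Linux. It reads the sysfs `thread_siblings_list` file and uses `os.sched_setaffinity`. The helper names (`parse_cpu_list`, `hyperthread_siblings`, `pin_to_siblings`) are made up for illustration, not part of any threading package.

```python
import os

def parse_cpu_list(text):
    """Parse a Linux cpulist string such as '0,4' or '0-1,8-9' into sorted ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)

def hyperthread_siblings(cpu=0):
    """Return the logical CPUs sharing a physical core with `cpu`, or None if unknown."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list"
    try:
        with open(path) as f:
            return parse_cpu_list(f.read())
    except OSError:
        return None  # not Linux, or topology not exposed

def pin_to_siblings(cpu=0):
    """Restrict the caller to the hyperthreads of one core (Linux only)."""
    siblings = hyperthread_siblings(cpu)
    if siblings and hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, siblings)  # pid 0 means the caller
        return siblings
    return None
```

Without pinning like this, the scheduler is free to move the threads to different cores, and the cheap sharing described above no longer applies.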
Nevertheless, you still want to minimize (a) the lines accessed by a single thread, and (b) the total number of lines accessed by all threads, even lines that are not shared between threads, since the caches (MLC, LLC) are a limited resource. But you are right - once you are missing the cache...
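To make the "minimize lines touched" point concrete, here is a small sketch that counts how many distinct cache lines a set of accesses covers. The 64-byte line size is an assumption (typical for current x86 parts), and `lines_touched` is just an illustrative helper.

```python
LINE_SIZE = 64  # assumed cache line size in bytes

def lines_touched(accesses, line_size=LINE_SIZE):
    """Return the set of cache-line indices covered by (address, size) accesses."""
    lines = set()
    for addr, size in accesses:
        if size <= 0:
            continue
        first = addr // line_size
        last = (addr + size - 1) // line_size
        lines.update(range(first, last + 1))
    return lines

# One 4-byte field read out of each of 16 structs laid out at a 64-byte stride:
strided = [(i * 64, 4) for i in range(16)]
# The same 16 fields packed contiguously:
packed = [(i * 4, 4) for i in range(16)]

print(len(lines_touched(strided)), len(lines_touched(packed)))  # prints "16 1"
```

Same amount of useful data, but the strided layout occupies sixteen lines of MLC/LLC capacity where the packed one occupies a single line.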