I was reading about the MESI snooping cache coherence protocol, which I guess is the protocol used in modern multicore x86 processors (please correct me if I'm wrong). The article says the following at one point:
A cache that holds a line in the Modified state must snoop (intercept) all attempted reads (from all of the other caches in the system) of the corresponding main memory location and insert the data that it holds. This is typically done by forcing the read to back off (i.e. retry later), then writing the data to main memory and changing the cache line to the Shared state.
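To check my understanding of the quoted behaviour, here is my own rough sketch of what the snooping cache does when another cache reads a line it holds in Modified state (this is just my model, not from the article, and the names are made up):

```cpp
#include <cstdint>
#include <cstdio>

// The four MESI states.
enum class MesiState { Modified, Exclusive, Shared, Invalid };

struct CacheLine {
    MesiState state = MesiState::Invalid;
    uint64_t  data  = 0;
};

// Hypothetical model of the quoted behaviour: when this cache snoops a
// read from another cache for a line it holds in Modified state, it
// writes the dirty data back to main memory and drops to Shared.
void on_snooped_read(CacheLine& line, uint64_t& main_memory_word) {
    if (line.state == MesiState::Modified) {
        main_memory_word = line.data;        // write-back of the dirty data
        line.state = MesiState::Shared;      // another cache now also holds a copy
    }
}

int main() {
    uint64_t memory_word = 0;
    CacheLine line{MesiState::Modified, 42};
    on_snooped_read(line, memory_word);
    std::printf("memory=%llu, now Shared=%d\n",
                (unsigned long long)memory_word,
                line.state == MesiState::Shared);
}
```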
Now what I don't understand is why the data needs to be written to main memory. Can't cache coherence just keep the contents of the caches synchronized without going to memory (unless the cache line is actually evicted, of course)? I mean, if one core is constantly reading and another is constantly writing, why not keep the data in the caches and just keep updating it there? Why incur the performance cost of writing back to main memory?
In other words, can't the cores reading the data read directly from the writing core's cache and update their own caches accordingly?
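For concreteness, this is the kind of access pattern I have in mind (a hypothetical example I wrote, not from the article): one thread writes the same cache line in a loop while another thread keeps reading it, so the line would bounce between the two cores' caches on every handoff.

```cpp
#include <atomic>
#include <thread>
#include <cstdio>

// One core keeps writing a shared word, another keeps reading it.
// Each write should put the line in Modified state in the writer's cache;
// each read by the other core is then a snooped read of that Modified line.
std::atomic<int> shared_value{0};

int main() {
    std::thread writer([] {
        for (int i = 1; i <= 1000000; ++i)
            shared_value.store(i, std::memory_order_release);
    });
    std::thread reader([] {
        int last = 0;
        while (last < 1000000)
            last = shared_value.load(std::memory_order_acquire);
    });
    writer.join();
    reader.join();
    std::printf("final value: %d\n", shared_value.load());
}
```

In this scenario my question is whether every such handoff really has to go through main memory, or whether the data could stay cache-to-cache.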