The data in a cache is grouped into blocks called cache lines, which are typically 64 or 128 bytes wide. A cache line is the smallest unit of memory that is transferred between the cache and main memory. This works well for most programs, because data that is close together in memory is often needed close together in time by a particular thread. However, it is also the root of the false sharing problem.
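If you want to see what line size your toolchain assumes, C++17 added std::hardware_destructive_interference_size in the &lt;new&gt; header. Compiler support arrived late (GCC 12, recent MSVC), so this sketch guards it with the feature-test macro and is only an illustration:

#include &lt;iostream&gt;
#include &lt;new&gt;

int main()
{
    // The standard constant, where the library provides it; it is the
    // implementation's guess at the cache-line size, not a runtime query.
#ifdef __cpp_lib_hardware_interference_size
    std::cout &lt;&lt; "Destructive interference size: "
              &lt;&lt; std::hardware_destructive_interference_size &lt;&lt; " bytes\n";
#else
    std::cout &lt;&lt; "hardware_destructive_interference_size not available; "
                 "64 bytes is a common assumption on x86-64.\n";
#endif
}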
When a program writes a value to memory, it goes first to the cache of the core that ran the code. If any other caches hold a copy of that cache line, their copy is marked as invalid and cannot be used. The new value is written to main memory, and the other caches must re-read it from there if they need it. Although this synchronization is implemented in hardware, it still takes time. And, of course, reading from main memory takes a few hundred clock cycles by itself.
Modern processors use the MESI protocol to implement cache coherence. This basically means each cache line can be in one of four states:

- Modified: this cache holds the only up-to-date copy of the line, and it differs from main memory.
- Exclusive: this cache holds the only copy, and it matches main memory.
- Shared: the copy matches main memory, but other caches may hold it too.
- Invalid: the copy is stale and must not be used.
When a core modifies any data in a cache line, that line transitions to "Modified", and any other caches that hold a copy of the same line are forced to "Invalid". Those cores must then read the data from main memory the next time they need it. That's all I need to say about this here. A detailed explanation is available on Wikipedia, if you're interested.
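If it helps to see that one transition written down, here is a toy sketch in C++; the enum and the two functions are purely illustrative inventions for this article, not a real coherence implementation, which lives entirely in hardware:

#include &lt;cassert&gt;

// Toy model of the single transition described above: a write makes the
// writer's copy Modified and invalidates every other cache's copy.
// Real MESI has more transitions (e.g. a solitary read yields Exclusive).
enum class MesiState { Modified, Exclusive, Shared, Invalid };

// What happens to the writing core's own copy of the line.
MesiState afterLocalWrite(MesiState) { return MesiState::Modified; }

// What happens to any other cache's copy of the same line.
MesiState afterRemoteWrite(MesiState) { return MesiState::Invalid; }

int main()
{
    assert(afterLocalWrite(MesiState::Shared)  == MesiState::Modified);
    assert(afterRemoteWrite(MesiState::Shared) == MesiState::Invalid);
}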
Imagine two different variables are being used by two different threads on two different cores. This appears to be embarrassingly parallel, as the different threads are using isolated data. However, if the two variables are located in the same cache line and at least one is being written, then there will be contention for that cache line. This is false sharing.
It is called false sharing because even though the different threads are not sharing data, they are, unintentionally, sharing a cache line.
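Here is a minimal sketch of that scenario in C++, assuming a 64-byte cache line (common on x86-64): two threads each increment their own counter, and alignas keeps the counters on separate lines. Remove the alignas and both counters will typically land in one line, and the same loops run noticeably slower because the line ping-pongs between the two cores' caches:

#include &lt;atomic&gt;
#include &lt;thread&gt;

// Two counters, one per thread. alignas(64) pads each onto its own
// cache line (64 bytes is an assumption here; see
// std::hardware_destructive_interference_size for a portable value).
struct Counters
{
    alignas(64) std::atomic&lt;long&gt; a{0};
    alignas(64) std::atomic&lt;long&gt; b{0};
};

int main()
{
    Counters counters;

    // Each thread touches only its own counter: no logical sharing.
    auto bump = [](std::atomic&lt;long&gt;&amp; c) {
        for (int i = 0; i &lt; 10'000'000; ++i)
            c.fetch_add(1, std::memory_order_relaxed);
    };

    std::thread t1(bump, std::ref(counters.a));
    std::thread t2(bump, std::ref(counters.b));
    t1.join();
    t2.join();
}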
Original source: http://www.codeproject.com/Articles/51553/Concurrency-Hazards-False-Sharing