No, there is no need to use the MFENCE, SFENCE, and LFENCE instructions in conjunction with the LOCK prefix.
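The MFENCE, SFENCE, and LFENCE instructions order memory operations, so that earlier loads and/or stores become globally visible before later ones. For instance, the MOV instruction can't be used with the LOCK prefix, so when we need the result of a plain store to be visible to all CPU cores before later memory accesses proceed, we reach for a fence instruction. A locked instruction already acts as a full barrier on its own. Note that a fence does not flush the CPU cache to RAM: it waits for pending stores to become visible to the other cores, which the cache coherency protocol takes care of.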
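To make the fence case concrete, here is a minimal C++ sketch of publishing a plainly stored value to another thread. The function name `publish_and_read` and the variables in it are hypothetical, chosen only for illustration. On x86, compilers typically lower `std::atomic_thread_fence(std::memory_order_seq_cst)` to MFENCE or an equivalent locked instruction, but the exact instruction depends on the compiler and version.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: one thread publishes `value`, another reads it.
int publish_and_read(int value) {
    int payload = 0;                 // plain data, written with an ordinary store (MOV)
    std::atomic<bool> ready{false};  // flag the consumer polls
    int seen = -1;

    std::thread producer([&] {
        payload = value;  // plain store, cannot carry a LOCK prefix
        // Full fence: the store above is ordered before the flag store below.
        std::atomic_thread_fence(std::memory_order_seq_cst);
        ready.store(true, std::memory_order_relaxed);
    });

    std::thread consumer([&] {
        // Spin until the producer raises the flag.
        while (!ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        // Acquire fence: the payload read below is ordered after the flag load.
        std::atomic_thread_fence(std::memory_order_acquire);
        seen = payload;
    });

    producer.join();
    consumer.join();
    return seen;
}
```

Calling `publish_and_read(42)` returns 42: the two fences guarantee that the consumer sees the payload once it has seen the flag.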
EDIT: more about locked atomic operations, from the Intel manual:
LOCKED ATOMIC OPERATIONS
The 32-bit IA-32 processors support locked atomic operations on locations in system memory. These operations are typically used to manage shared data structures (such as semaphores, segment descriptors, system segments, or page tables) in which two or more processors may try simultaneously to modify the same field or flag. The processor uses three interdependent mechanisms for carrying out locked atomic operations:
• Guaranteed atomic operations
• Bus locking, using the LOCK# signal and the LOCK instruction prefix
• Cache coherency protocols that insure that atomic operations can be carried out on cached data structures (cache lock); this mechanism is present in the Pentium 4, Intel Xeon, and P6 family processors
These mechanisms are interdependent in the following ways. Certain basic memory transactions (such as reading or writing a byte in system memory) are always guaranteed to be handled atomically. That is, once started, the processor guarantees that the operation will be completed before another processor or bus agent is allowed access to the memory location. The processor also supports bus locking for performing selected memory operations (such as a read-modify-write operation in a shared area of memory) that typically need to be handled atomically, but are not automatically handled this way. Because frequently used memory locations are often cached in a processor’s L1 or L2 caches, atomic operations can often be carried out inside a processor’s caches without asserting the bus lock. Here the processor’s cache coherency protocols insure that other processors that are caching the same memory locations are managed properly while atomic operations are performed on cached memory locations.
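As an illustration of the read-modify-write case, here is a minimal C++ sketch of a shared counter incremented from several threads. The function name `count_with_locked_increment` is hypothetical. On x86, compilers usually translate `fetch_add` into a LOCK-prefixed instruction such as `lock xadd` or `lock add`, though that is up to the compiler. No MFENCE, SFENCE, or LFENCE is written anywhere in the code.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: `threads` workers each add 1 to a shared counter
// `per_thread` times and the final count is returned.
long count_with_locked_increment(int threads, long per_thread) {
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;

    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&counter, per_thread] {
            for (long i = 0; i < per_thread; ++i) {
                // Atomic read-modify-write; no separate fence is needed.
                counter.fetch_add(1, std::memory_order_seq_cst);
            }
        });
    }

    for (auto& worker : pool) {
        worker.join();
    }
    return counter.load();
}
```

With 4 threads and 100000 increments each, the function returns 400000. A plain `long` incremented with `++` in place of the atomic would be a data race and could lose updates.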