I read the Intel manual and found that there is a LOCK prefix for instructions, which can prevent processors from writing to the same memory location at the same time. I was quite excited about it, and guessed it could be used as a hardware mutex, so I wrote a piece of code to try it out. The result was quite frustrating. The LOCK prefix does not support the MOV or LEA instructions. The manual says LOCK only supports ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG, CMPXCHG8B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG. What is more, if the LOCK prefix is used with one of these instructions and the source operand is a memory operand, an undefined opcode exception (#UD) may be generated.
I wonder why there are so many limitations. These restrictions make LOCK seem useless: I cannot use it to guarantee that a general write operation is free of dirty data or other problems caused by parallelism.
For example, I wrote ++(*p) in C, where p is a pointer to shared memory. The corresponding assembly looks like this:
movl 28(%esp), %eax
movl (%eax), %eax
leal 1(%eax), %edx
movl 28(%esp), %eax
movl %edx, (%eax)
I added "lock" before "movl" and "leal", but the processor complains "Invalid Instruction". :-( I guess the only way to serialize the write operations is to use a software mutex, right?
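For comparison, here is a minimal sketch of how I understand the same increment could be expressed as a single atomic read-modify-write, assuming a C11 compiler that provides <stdatomic.h>. The helper name atomic_increment is hypothetical and not part of my original code. On x86, compilers typically translate atomic_fetch_add into one LOCK-prefixed ADD or XADD with a memory destination, which is one of the forms the manual allows, but I have not verified the generated code for every compiler.

```c
#include <stdatomic.h>

/* Hypothetical helper (an assumption, not from the original code):
 * performs ++(*p) on a shared counter as one indivisible
 * read-modify-write and returns the incremented value. */
int atomic_increment(atomic_int *p)
{
    /* atomic_fetch_add returns the value held *before* the addition,
     * so add 1 to report the value after the increment. */
    return atomic_fetch_add(p, 1) + 1;
}
```

With a shared counter declared as atomic_int counter = 0; each thread would call atomic_increment(&counter) instead of ++(*p). This only covers a single read-modify-write on one variable; protecting a general sequence of writes would still need a mutex or similar.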