Impelementing atomic CAS in gcc inline

Question

I've been trying to implement a gcc inline function (AT&T assembly) that will perform an atomic CAS operation, but I can't get it to work - the return value is always getting messed up.

I've tried 2 different approaches, each seems to have its own misbehaviours:

1.

static inline int
cas(volatile void * addr, int expected, int newval) 
{
  int result = 1;
  asm volatile("lock; cmpxchgl %3, (%2)\n\t"
               "pushfl\n\t"
              "popl %%ebx\n\t"
              "andl $0x40, %%ebx\n\t"
              "cmpl $0x0, %%ebx\n\t"
              "jnz res%=\n\t"
              "movl $0, %0\n\t"
              "res%=:\n\t"
              : "=m"(result)
              : "a"(expected), "b"(addr), "r"(newval)
              : "memory");

  return result;
}

2.

static inline int cas(volatile void * addr, int expected, int newval) {
  int ret = 1;
  asm volatile("lock; cmpxchgl %3, (%2)\n\t"
          "jz cas_success%=\n\t"
          "movl $0, %0\n\t"
          "cas_success%=:\n\t"
  : "=m"(ret)
  : "a"(expected), "b"(addr), "r"(newval)
  : "memory");
  return ret;
}

But neither work, could anyone point me at the problem with one of the implementations?

Thanks

Why not use the C11 atomic types, or gcc intrinsics? No need to drop to assembly for an atomic CAS. — Shawn, May 11 '20 at 13:40
Instead of loading the flags register into `ebx`, why don't you simple use a `jz` or `jnz` to jump directly on the value of the zero flag? — fuz, May 11 '20 at 13:51
@Shawn I'm doing an assignment and that's a part of the requirements — J. Doe, May 11 '20 at 13:53
So many problems and inefficiencies it's hard to know where to start. `pushf` is obviously insane; ZF is already in FLAGS ready for you to `jz` or `setz` with it instead of extracting it to an integer reg and getting it back into ZF with `and`/`cmp`. `"=m"(ret)` is a pure output, the `ret=1` initializer can thus be optimized away. Also, no reason to use a memory operand, `"+r"` would work. — Peter Cordes, May 11 '20 at 13:54
@J.Doe How do you know it doesn't work? How do you call this code and what output do you expect and get instead? — fuz, May 11 '20 at 13:56
@PeterCordes If I don't initialize ret to 1 how can I be sure I get a non zero value on return? I can't say I understand your other comments. — J. Doe, May 11 '20 at 13:58
@fuz It sometimes returns a big negative number on success, and sometimes even returns 0 — J. Doe, May 11 '20 at 14:00
@J.Doe An `=` constraint says “the assembly will always overwrite whatever value that operand has.” Using an `=` constraint and then not assigning any value to that operand causes the operand to take an indeterminate value. Use a `+` constraint to tell the compiler that the operand's previous value is important. — fuz, May 11 '20 at 14:00
@J.Doe: That's the opposite of what I meant. I said your constraint ignores the fact that you initialized `ret`, so you should fix your constraint to be `"+r"(ret)` (a read/write operand). Notice the difference between `+` and `=` in the GCC manual.: https://gcc.gnu.org/onlinedocs/gcc/Modifiers.html. Or you could just use an `"=r"` 8-bit `_Bool` output with `setz %0` to store ZF into a register as a 0 or 1 integer. See also https://stackoverflow.com/tags/inline-assembly/info for docs and tutorials. — Peter Cordes, May 11 '20 at 14:04
@fuz This seems to work well now, but another problem emerged, I'll have to test some more and I'll return if the code still isn't working. — J. Doe, May 11 '20 at 14:05
Or you could use GCC6 flag-output constraints to tell GCC directly that one of the outputs *is* ZF, avoiding the overhead of a `setz` instruction inside the asm. My answer on [c inline assembly getting "operand size mismatch" when using cmpxchg](https://stackoverflow.com/a/49839296) shows **a fully working C wrapper for `lock; cmpxchg` that has no other asm instructions inside the template**, just the necessary constraints. — Peter Cordes, May 11 '20 at 14:08
@fuz: not really; this question is asking *why* this attempt doesn't work. [Why does my "=r"(var) output not pick the same register as "a"(var) input?](https://stackoverflow.com/q/58238986) is closer to that, but the bug there is that they're expecting `"=r"` to pick the same register as an input operand. I assume there's a dup somewhere for using `"=r"` instead of `"+r"`, which might be all that's wrong with this for correctness. This is nasty for efficiency (branching instead of `setcc` and not letting the compiler pick an addressing mode) but that might be all for correctness. — Peter Cordes, May 11 '20 at 14:14
do you have it working in real assembly and trying to get it working in inline or you dont have it working at all? — old_timer, May 11 '20 at 15:32
@PeterCordes I know it's been a long time, just wondering if you wanted to summarize what you said into an answer, so that I could mark it as solved. The message before the one I said "this seems to work now" is what did the trick. — J. Doe, Jun 02 '20 at 16:10
You could answer your own question if you now understand what was wrong with your original version. I've already written a couple answers about `lock cmpxchg` inline asm, including [c inline assembly getting "operand size mismatch" when using cmpxchg](https://stackoverflow.com/a/49839296), and don't really feel the need to write another one myself. — Peter Cordes, Jun 02 '20 at 21:27

Impelementing atomic CAS in gcc inline

0 Answers0