2

I am trying to get a modulo-n (0 < n) in the range [0 ... n-1].

The instruction idiv n divides EDX:EAX (64 bits) by n, and leaves
EAX=quotient, EDX=remainder. (n is a register in my code)

My problem is that when the content of EDX:EAX is negative, I get a negative result in EDX.

The simplest solution I found was:

cdq            ; extend EAX sign bit to EDX
idiv n         ; edx = (possibly neg.) remainder
add edx, n
mov eax, edx   ; eax = remainder + n
cdq
idiv n         ; edx = positive remainder

Is there a cleaner/easier/quicker way to get the positive remainder?

Peter Cordes
  • 328,167
  • 45
  • 605
  • 847
S. Golan
  • 31
  • 4
  • 1
    It's called remainder. Quotient is the result of division. – Ped7g Nov 29 '16 at 09:25
  • Now I see you didn't specify `n` to be positive explicitly, I deducted that one from *"edx:eax is negative => negative remainder"*, which can happen only for positive `n`. ("non-negative" as well, because `n==0` will fail sooner :) ). – Ped7g Nov 29 '16 at 10:57
  • 1
    @Ped7g, Thanks for your comments. I will edit and change 'quotient' to 'remainder'. About n, I did mention that 0 < n. – S. Golan Nov 30 '16 at 12:19

2 Answers2

4

-5 mod 3 = -2 (remainder, -1 quotient)

patched remainder: -2 + 3 = +1, that's what you want, right?

Quotient is then -1 - 1 = -2.

Verification: -2 * 3 + +1 = -5

cdq            ; extend EAX sign bit to EDX
idiv n         ; edx = (possibly neg.) remainder
mov eax, edx   ; eax = copy of remainder
add edx, n     ; negative remainder modified to positive one (for --quotient)
               ; when going from negative to positive, CF will be set (unsigned overflow)
cmovc eax,edx  ; when CF, load eax with modified remainder
; eax = to-be-positive-adjusted remainder

I didn't verify it in debugger, and I merely woke up, so test it first.

Ped7g
  • 16,236
  • 3
  • 26
  • 63
  • Yeah, looks right for positive divisors at least. Wrapping around from negative to positive sets CF. (That's *not* signed overflow though, so I'd suggest calling it "wraparound" instead of `(overflow)` to avoid confusion.) – Peter Cordes Nov 29 '16 at 09:34
  • 1
    I think this breaks for large positive remainders, though: if `n + remainder` carries, then you'll get `n + remainder` instead of a positive remainder. – Peter Cordes Nov 29 '16 at 09:36
  • @PeterCordes been thinking about it.. He's using `idiv`, so `n` is signed 32b. Then MAX_N = `0x7F...FF` . Max remainder is `0x7F...FE`. `0x7F...FF` + `0x7F...FE` = `0xFF...FD`, **CF = 0**. So as long as `n` is signed and positive, it should work correctly. EDIT: and I'm not sure about that "wraparound" wording either, the `add edx,n` does produce classic unsigned `add` overflow, when you look at it without signed context. I added at least "unsigned" to it, but "overflow" feels right for me in this case? Depends probably how you read the code in mind, if you do negative+n or uint+n? – Ped7g Nov 29 '16 at 10:45
  • Ahh, yes good point. I agree, `MAX_INT + MAX_INT` doesn't carry, and the OP has said that the divisor is non-negative, so there's no problem. – Peter Cordes Nov 29 '16 at 10:50
  • Anyway, at least you made me to think about it, I was simply ignoring it before, because I found it very unlikely the divisors are large.. but better to *know* for sure. :) Thanks. – Ped7g Nov 29 '16 at 10:52
  • 2
    Turns out we can get clang to emit this asm given the right C input (see my answer). gcc still wants to use a CMP instead of reading CF from the ADD. Wouldn't have been worth a whole answer except that I also found an interesting paper, and what this kind of modulo is called mathematically. – Peter Cordes Nov 29 '16 at 13:18
4

The kind of division that produces a non-negative modulo is called Euclidean division.

For C implementations and more background, see Division and Modulus for Computer Scientists. (also related: What's the difference between “mod” and “remainder”?).

This is a special case, where we know the divisor is positive, which allows the faster implementation suggested by Ped7g. It's probably optimal or at least close to it.

Interestingly, we can write it in C in a way that compiles to the same asm as Ped7g's hand-written version (or close to it, depending on gcc vs. clang). See it on the Godbolt compiler explorer.

// https://stackoverflow.com/questions/40861023/how-do-i-get-a-positive-modulo-on-a-negative-dividend
// uses unsigned carry to detect when a negative 2's complement integer wrapped to non-negative
long modE_positive_divisor_2s_complement( long D, long d )
{
  long r = D%d;

  unsigned long tmp = (unsigned long)r + (unsigned long)d;
  // detect carry by looking for unsigned wraparound:
  // that means remainder was negative (not zero), so adding the (non-negative) divisor is what we need
  r = (tmp < (unsigned long)r) ? (long)tmp : r;

  return r;
}

With clang3.9 -O3, we get:

    mov     rax, rdi
    cqo
    idiv    rsi
    add     rsi, rdx
    cmovae  rsi, rdx
    mov     rax, rsi
    ret

gcc6.2 uses an extra CMP before the CMOV :(

On Godbolt, I included the original general-case C function and a version that tells the compiler to assume positive d (using __builtin_unreachable()). For the latter, gcc does about as well as with this, using a test rdx,rdx / cmovs to do the conditional add.

It works identically for 32-bit vs. 64-bit long (with EAX instead of RAX, etc.), so use int32_t if you care about performance and don't need 64-bit integers. (idiv r64 is much slower than idiv r32).

Community
  • 1
  • 1
Peter Cordes
  • 328,167
  • 45
  • 605
  • 847