59

Apparently, x86 (and probably a lot of other instruction sets) put both the quotient and the remainder of a divide operation in separate registers.

Now, we can probably trust compilers to optimize a code such as this to use only one call to divide:

( x / 6 )
( x % 6 )

And they probably do. Still, do any languages (or libraries, but mainly looking for languages) support giving both the divide and modulo results at the same time? If so, what are they, and What does the syntax look like?

Tilman Vogel
  • 9,337
  • 4
  • 33
  • 32
Ben
  • 7,692
  • 15
  • 49
  • 64

9 Answers9

56

C has div and ldiv. Whether these generate separate instructions for the quotient and remainder will depend on your particular standard library implementation and compiler and optimization settings. Starting with C99, you also have lldiv for larger numbers.

bta
  • 43,959
  • 6
  • 69
  • 99
  • Interesting to note that modulo alone is not implemented with `div` in 4.8: http://stackoverflow.com/questions/4361979/how-does-the-gcc-implementation-of-module-work-and-why-does-it-not-use-the – Ciro Santilli OurBigBook.com Oct 20 '15 at 20:30
  • Went ahead and accepted this answer. I know there are still many valid answers here, so its hard to say that one is 'more right' than the others, but C is a good starting place to talk about these things. – Ben Dec 02 '15 at 18:43
  • 5
    **Do not use this with current compilers, especially for division by a constant: it doesn't optimize**. See https://godbolt.org/g/ydL27q: `div(var, 10)` compiles into an actual function call, and the `div` library implementation doesn't have the information that the divisor is a constant `10`. So it can't use [a multiplicative inverse](https://stackoverflow.com/questions/41183935/why-does-gcc-use-multiplication-by-a-strange-number-in-implementing-integer-divi). Even with a runtime-variable divisor, you get larger code size and a non-inline function call on x86. – Peter Cordes Aug 25 '17 at 19:30
  • 1
    If these functions were necessary or commonly used, compilers would have builtin versions of them (like they do for `memcpy`), but they aren't, so they don't optimize nearly as well as they could. Thus, it's much better to just use `n / d` and `n % d` nearby, with exactly the same `n` and `d`, so the compiler can reliably optimize division by a constant, constant-propagation, and range information. (e.g. the compiler knows the result of dividing by 10 has a smaller range than the full `int` range, maybe allowing it to optimize later.) – Peter Cordes Aug 25 '17 at 19:33
  • Anyway, this is a good answer to this question, but it's important to point out that you shouldn't actually use this part of the language, unless you first implement fast support for it in the compiler you're using. For 64-bit division on a 32-bit machine, or other cases where it can't just inline something tiny, gcc already calls an internal `libgcc.a` function, so you probably never even save code-size with these functions. – Peter Cordes Aug 25 '17 at 19:34
  • @PeterCordes Be careful not to over-generalize. Whether something gets optimized or not depends on the implementation of your specific compiler and/or standard library. I've seen implementations with an optimized `div` and others without it. Always test *your* specific implementation before assuming that something is or isn't optimized. – bta Oct 24 '17 at 19:15
  • @bta: Fair point, I meant current x86 compilers (and provided a Godbolt link to test gcc/clang/ICC/MSVC). Are there any implementations where you get *better* code from calling `div` than from writing `r = x%y; q = x/y;`? That's common, so most good compilers will optimize it unless aliasing analysis stops CSE from working (e.g. `x` and `y` aren't simple local variables). If calling `div` is the best way on that platform, that's what compilers could do for open-coded `%` and `/`. Sure it's good to test, but my guess is that open-coding is the safer bet across more implementations. – Peter Cordes Oct 24 '17 at 20:52
  • 1
    I've definitely seen a `div()` function call optimized to get both results from a single `DIV` instruction, where separate `/` and `%` statements effectively run the entire computation twice (I don't recall which compiler, it was an embedded platform though). If `x` is `volatile`, your results may change for completely separate reasons. Again, always test implementation-specific behavior with your specific use case before optimizing for/around it. – bta Oct 24 '17 at 21:45
34

Python does.

>>> divmod(9, 4)
(2, 1)

Which is odd, because Python is such a high level language.

So does Ruby:

11.divmod(3) #=> [3, 2]

*** EDIT ***

It should be noted that the purpose of these operators is probably not to do the work as efficiently as possible, it is more likely the functions exist for correctness/portability reasons.

For those interested, I believe this is the code of the Python implementation for integer divmod:

static enum divmod_result
i_divmod(register long x, register long y,
     long *p_xdivy, long *p_xmody)
{
long xdivy, xmody;

if (y == 0) {
    PyErr_SetString(PyExc_ZeroDivisionError,
                    "integer division or modulo by zero");
    return DIVMOD_ERROR;
}
/* (-sys.maxint-1)/-1 is the only overflow case. */
if (y == -1 && UNARY_NEG_WOULD_OVERFLOW(x))
    return DIVMOD_OVERFLOW;
xdivy = x / y;
/* xdiv*y can overflow on platforms where x/y gives floor(x/y)
 * for x and y with differing signs. (This is unusual
 * behaviour, and C99 prohibits it, but it's allowed by C89;
 * for an example of overflow, take x = LONG_MIN, y = 5 or x =
 * LONG_MAX, y = -5.)  However, x - xdivy*y is always
 * representable as a long, since it lies strictly between
 * -abs(y) and abs(y).  We add casts to avoid intermediate
 * overflow.
 */
xmody = (long)(x - (unsigned long)xdivy * y);
/* If the signs of x and y differ, and the remainder is non-0,
 * C89 doesn't define whether xdivy is now the floor or the
 * ceiling of the infinitely precise quotient.  We want the floor,
 * and we have it iff the remainder's sign matches y's.
 */
if (xmody && ((y ^ xmody) < 0) /* i.e. and signs differ */) {
    xmody += y;
    --xdivy;
    assert(xmody && ((y ^ xmody) >= 0));
}
*p_xdivy = xdivy;
*p_xmody = xmody;
return DIVMOD_OK;
}
phuclv
  • 37,963
  • 15
  • 156
  • 475
Eloff
  • 20,828
  • 17
  • 83
  • 112
  • Does `divmod` runs only one operation? What is the code behind this function? – BrunoLM Oct 09 '10 at 00:08
  • Beat me to it. divmod() is a built-in function in Python. – Russell Borogove Oct 09 '10 at 00:08
  • @BrunoLM I would bet a large quantity of [insert favorite beverage] that `divmod` simply performs both operations separately and packages the results, but have no proof to offer. – Andrew Barber Oct 09 '10 at 00:09
  • @BrunoLM: The VM calls a native function, which I would hope does a native div instruction. – Russell Borogove Oct 09 '10 at 00:13
  • @Russell : hehe; i actually worded my potential bet incorrectly! What I meant was, I don't think it's trying to pull any low-level 'tricks' to make the operation efficient, but instead is just a way to save a few keystrokes for the dev. :-P – Andrew Barber Oct 09 '10 at 00:19
  • Doesn't Python signed integer division always give positive modulo? Unlike C and x86 asm `div`, where the remainder can be negative remainder. So it definitely can't *just* use x86 `div`, even if the Python integer is a single "chunk" (not a bignum). That's why the C implementation uses `/` but not `%`. – Peter Cordes Nov 21 '20 at 21:26
  • CRuby implementation of divmod: `return rb_assoc_new(num_div(x, y), num_modulo(x, y));` – Simon Perepelitsa Aug 03 '22 at 19:46
11

In C#/.NET you've got Math.DivRem: http://msdn.microsoft.com/en-us/library/system.math.divrem.aspx

But according to this thread this isn't that much an optimization.

Community
  • 1
  • 1
elmattic
  • 12,046
  • 5
  • 43
  • 79
4

In Java (since 1.5) the class BigDecimal has the operation divideAndRemainder returning an array of 2 elements with the result and de remainder of the division.

BigDecimal bDecimal = ...
BigDecimal[] result = bDecimal.divideAndRemainder(new BigDecimal(60));

Java 17 Javadoc: https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/math/BigDecimal.html#divideAndRemainder(java.math.BigDecimal)

Iván
  • 129
  • 1
  • 7
3

The .NET framework has Math.DivRem:

int mod, div = Math.DivRem(11, 3, out mod);
// mod = 2, div = 3

Although, DivRem is just a wrapper around something like this:

int div = x / y;
int mod = x % y;

(I have no idea whether or not the jitter can/does optimise that sort of thing into a single instruction.)

LukeH
  • 263,068
  • 57
  • 365
  • 409
3

Common Lisp does: http://www.lispworks.com/documentation/HyperSpec/Body/f_floorc.htm

Pedro Rodrigues
  • 1,732
  • 11
  • 17
3

As Stringer Bell mentioned there is DivRem which is not optimized up to .NET 3.5.

On .NET 4.0 it uses NGen.

The results I got with Math.DivRem (debug; release = ~11000ms)

11863
11820
11881
11859
11854

Results I got with MyDivRem (debug; release = ~11000ms)

29177
29214
29472
29277
29196

Project targeted for x86.


Math.DivRem Usage example

int mod1;
int div1 = Math.DivRem(4, 2, out mod1);

Method signatures

DivRem(Int32, Int32, Int32&) : Int32
DivRem(Int64, Int64, Int64&) : Int64

.NET 4.0 Code

[TargetedPatchingOptOut("Performance critical to inline across NGen image boundaries")]
public static int DivRem(int a, int b, out int result)
{
    result = a % b;
    return (a / b);
}

.NET 4.0 IL

.custom instance void System.Runtime.TargetedPatchingOptOutAttribute::.ctor(string) = { string('Performance critical to inline across NGen image boundaries') }
.maxstack 8
L_0000: ldarg.2 
L_0001: ldarg.0 
L_0002: ldarg.1 
L_0003: rem 
L_0004: stind.i4 
L_0005: ldarg.0 
L_0006: ldarg.1 
L_0007: div 
L_0008: ret 

MSDN Reference

phuclv
  • 37,963
  • 15
  • 156
  • 475
BrunoLM
  • 97,872
  • 84
  • 296
  • 452
  • 4
    This answer is a little misleading since the times that jump out at you appear to show Math.DivRem being optimized in .Net 4.0 but as you note off to the side a bit, it is in fact not optimized at all. In fact, in my tests Math.DivRem() is slightly SLOWER than the naive div and mod ops alone, on all versions of .Net. In other words, it is not at all optimized. – Chris Moschini Jul 27 '14 at 20:54
  • That's because this is benchmarking debug mode, so anything in your own code is going to be terrible compared to calling an already-compiled library function. It does mention that "release" times are about equal, which is what matters. (But I think "optimized" in this case means "not worse than letting the compiler [CSE](https://en.wikipedia.org/wiki/Common_subexpression_elimination) `x/y` and `x%y` in an open-coded version, and in .NET3.5 it may actually have been worse?) – Peter Cordes Nov 21 '20 at 21:30
2

Haskell has both divMod and quotRem that latter of which corresponds directly to the machine instruction (according to Integral operators quot vs. div) while divMod may not.

Cody Gray - on strike
  • 239,200
  • 50
  • 490
  • 574
Jon Wolski
  • 2,293
  • 2
  • 19
  • 21
0
    int result,rest;
    _asm
    {
        xor edx, edx // pone edx a cero; edx = 0
        mov eax, result// eax = 2AF0
        mov ecx, radix // ecx = 4
        div ecx
        mov val, eax
        mov rest, edx
    }
Ernesto Alfonso
  • 650
  • 10
  • 30
  • 2
    The compiler can already do this optimization. Using MSVC's clunky bad inline asm this way just forces some store/reload round trips. Also, you're doing unsigned division, so the variables should be `unsigned`, not `int`. Also, never use `div` for a known power of 2, such as 4. Use `shr`/`and` to get quotient and remainder. – Peter Cordes Nov 21 '20 at 21:33