
Paraphrasing from the book "Programming Pearls" (about the C language on older machines, since the book is from the late 90s):

Integer arithmetic operations (+, -, *) can take around 10 nano seconds whereas the % operator takes up to 100 nano seconds.

  • Why is there that much difference?
  • How does a modulus operator work internally?
  • Is it the same as division (/) in terms of time?
Alex Brown
AV94
  • As an exercise, write the most naive version of, say, division, and then modulus. Count the instructions each would require prior to optimization. Obviously there will be more performant ways to do it (before you even get to CPU-level optimizations), but it will give you an idea of the difference. – Ed S. Jan 16 '15 at 05:35
  • I'm surprised division is reported to be about the same as *, -, +. Even on new processors division is many times slower. – SunsetQuest Jan 16 '15 at 05:46
  • What language? And what is the divisor? And what is the type you are calling modulus on-int or double or float? – Alex Brown Jan 16 '15 at 07:07
  • @AlexBrown Language: C. By modulus operator, I mean the "%" operator. Say, for example: 23413 % 34 – AV94 Jan 16 '15 at 08:13
  • Aha! Reformatted your question so I can appreciate it in these terms. – Alex Brown Jan 16 '15 at 08:50

1 Answer


The modulus/modulo operation is usually understood as the integer remainder of division - a side effect or counterpart of the division operation.

Except for some degenerate cases (where the divisor is a power of the operating base - i.e. a power of 2 for most number formats) this is just as expensive as integer division!

So the question is really, why is integer division so expensive?

I don't have the time or expertise to analyze this mathematically, so I'm going to appeal to grade school maths:

Consider the number of lines of working out in the notebook (not including the inputs) required for:

  • Equality: (Boolean operations) essentially none - in computer "big O" terms this is known as O(1)
  • Addition: two, working left to right, one line for the output and one line for the carry. This is an O(N) operation
  • Long multiplication: n*(n+1) + 2: two lines for each of the digit products (one for the total, one for the carry) plus a final total and carry. So O(N^2), but with a fixed N (32 or 64), and it can be pipelined in silicon to less than that
  • Long division: unknown, depends upon the argument size - it's a recursive descent and some instances descend faster than others (1,000,000 / 500,000 requires fewer lines than 1,000 / 7). Also, each step is essentially a series of multiplications to isolate the closest factors. (Although multiple algorithms exist.) Feels like an O(N^3) with variable N

So in simple terms, this should give you a feel for why division and hence modulo is slower: computers still have to do long division in the same stepwise fashion that you did in grade school.

If this makes no sense to you, you may have been brought up on school math a little more modern than mine (30+ years ago).


The Order/big O notation used above as O(something) expresses the complexity of a computation in terms of the size of its inputs, and says something about its execution time: http://en.m.wikipedia.org/wiki/Big_O_notation

O(1) executes in constant (but possibly large) time. O(N) takes as much time as the size of its data, so if the data is 32 bits it takes 32 times the O(1) time of the step to calculate one of its N steps, and O(N^2) takes N times N (N squared) the time of its N steps (or possibly N times M*N for some constant M). Etc.


In the above working I have used O(N) rather than O(N^2) for addition, since the 32 or 64 bits of the first input are calculated in parallel by the CPU. In a hypothetical 1-bit machine a 32-bit addition operation would be O(32^2) and change. The same order reduction applies to the other operations too.

Alex Brown
  • Actually, if you were to work out multiplication in a notebook you might consider using Karatsuba's method, or if you're a bit crazy - FFT. See [here](https://en.wikipedia.org/wiki/Multiplication_algorithm). – einpoklum May 23 '16 at 09:17