23

Division in processor takes much time, so I want to ask how to check in fastest way if number is divisible of some other number, in my case I need to check if number is divisible by 15.

Also I've been looking through web and found fun ways to check if number is divisible of some number, but I'm looking for fast option.

NOTE: as division takes much time I'm looking for answer without / and %.

  • 30
    `a % 15 == 0`: trust the compiler and hardware to do it efficiently. – recursion.ninja Sep 09 '13 at 20:37
  • 4
    @awashburn modulus is not the fastest option in some cases – ST3 Sep 09 '13 at 20:39
  • 5
    @user2623967 hence the word *trust*. The compiler most likely knows better then you how to optimize for the local system. – recursion.ninja Sep 09 '13 at 20:40
  • @user2623967 it probably is, the compiler is written by people whose job it is to know these tricks, compilers are very good at optimizing these sort of things as opposed to actual algorithms so in this case you are probably incorrect – aaronman Sep 09 '13 at 20:41
  • @awashburn A year ago I've been playing with similar situations, and foudout that it is ossible to write better code then compiler generates (of course not in all cases) – ST3 Sep 09 '13 at 20:42
  • @user2623967 prove it then, write a test to see if your function runs faster than mod, my guess is not even close to it – aaronman Sep 09 '13 at 20:43
  • @user2623967 anything other then `%` is unlikely to be an **optimal** ***and*** **portable** solution. – recursion.ninja Sep 09 '13 at 20:45
  • I'm glad that I got an answer, but the funiest part, is highly downvoted. –  Sep 09 '13 at 21:24
  • @WhiteMan the reason it got downvoted originally is because someone posted a ideone where it did not work – aaronman Sep 09 '13 at 22:12
  • 1
    possible duplicate of [Check if a number is divisible by 3](http://stackoverflow.com/questions/3421609/check-if-a-number-is-divisible-by-3) (my answer has solution for N=15 too) – MSalters Sep 09 '13 at 23:32
  • @MSalters i would say this is not a dupe, the question you answered is about how to do it without `/`,`%`,or `*`, when all the answers here, except the accepted one say you should just use `%`, please don't close – aaronman Sep 10 '13 at 01:21
  • 2
    @awashburn: "trust the hardware and compiler to do it efficiently" is unfortunately not true as some compiler don't implement non-division solutions when while others - like gcc - do. When they don't you're stuck with inefficient code and you have to resort to manually implementing the code. – Olof Forshell Sep 10 '13 at 09:14
  • 1
    @P0W: divisions DO take time, a lot of it. Not all compilers implement alternate, MUCH faster solutions. So sure you can lose faith in compilers. It's not fun, but sometimes you have to. – Olof Forshell Sep 10 '13 at 09:20
  • Even GCC doesn't do this optimally. It uses the "divide by multiply"-algorithm and then uses that to extract the remainder, which is much better than a naive modulo, but not optimal. See chapter 9 for details: http://gmplib.org/~tege/divcnst-pldi94.pdf – harold Sep 13 '13 at 12:27
  • 1
    @harold: which may be superseeded by this paper: https://gmplib.org/~tege/division-paper.pdf – Olof Forshell Apr 09 '14 at 12:40

6 Answers6

38

Obligatory answer for other learners who might come looking for an answer.

if (number % n == 0)

In most cases, you can always do this, trusting the smart modern compilers.

This doesn't mean you get discouraged from learning fun ways though. Check out these links.

Fast divisibility tests (by 2,3,4,5,.., 16)?

Bit Twiddling Hacks

Community
  • 1
  • 1
max
  • 4,248
  • 2
  • 25
  • 38
  • 5
    "Obligatory answer?" Learners want to learn, the really interested ones expand the envelope. "Trusting the smart modern compilers?" Yeah, right. In your example I can trust the smart modern compiler to use a 60+ clock cycle divide instruction which is what compilers always do when the divisor is variable. Silly anser. – Olof Forshell Sep 10 '13 at 08:59
  • 2
    And what if the "fun ways" are the better ways, the faster ways, the more efficient ways? Then they should replace the current ways. – Olof Forshell Sep 10 '13 at 12:20
  • 1
    @OlofForshell When divisor is variable, there is not much else you can do anyway. – Erbureth Apr 09 '14 at 11:08
  • @Erbureth: there are almost always things that can be done. That they may be more or less feasible is another issue. If the programmer knows the range of values for n that might lead to openings. Certainly for the {2, 3, 4 ... 16} set there are generic possibilities which might clobber performance for the {2, 4, 8, 16} set but speed up the others. Usually it's about trading space for time and if you're in a jam, what choice do you have other than to try? Or if you want to explore the possibilities, learning something all by yourself - not being told how something should be done? – Olof Forshell Apr 09 '14 at 12:20
27

Multiplication takes less time then division, so you can try this:

inline bool divisible15(unsigned int x)
{
    //286331153 = (2^32 - 1) / 15
    //4008636143 = (2^32) - 286331153
    return x * 4008636143u <= 286331153u;
}

This way works because 2^32-1 (max 32-bit value) is divisible of 15, however if you take, for example 7, it would look like working, but wouldn't work in all cases.

EDIT: See this, it proves that this solution (on some compilers) is faster then module.

EDIT: Here is the explanation and the generalization.

Cassio Neri
  • 19,583
  • 7
  • 46
  • 68
ST3
  • 8,826
  • 3
  • 68
  • 92
  • 1
    There are working solutions for 7 and other non-power-of-2 values. gcc implements these solutions when the divisor is constant. I got a real fright the first time I saw the generated assembly. But it works and it's fast. – Olof Forshell Sep 10 '13 at 09:05
  • @OlofForshell I used 7 as an example number which do not work in this case, I know it is other ways to check if value is divisible, but in 15 case, this is probably fastest. – ST3 Sep 10 '13 at 09:54
  • 7 may be tested in the same manner but using different constants. For the moment the multiplication method is the fastest but who says that won't ever change? – Olof Forshell Sep 10 '13 at 12:16
  • @OlofForshell hmmm... not sure about that, I think it may not work with all values, can you give me those constants? – ST3 Sep 10 '13 at 12:28
  • On GCC, this isn't because multiplication takes less time than division, but because on that specific benchmark the multiplication in the loop is optimized (via induction variable optimization) before modulo is turned into a multiplication+simple arithmethic and shifts (first RTL transformation). – ninjalj Sep 10 '13 at 13:32
  • When compiling with -O3, modulo is slightly faster than the multiplication method: http://coliru.stacked-crooked.com/a/99507cde35389c48 – interjay Sep 10 '13 at 17:04
  • 2
    @user2623967: I spent some time looking for this (http://homepage.cs.uiowa.edu/~jones/bcd/divide.html) link which explains a more structured approach and you can also check two publications here (http://gmplib.org/~tege/) concerning division by invariant (constant) integers. – Olof Forshell Sep 12 '13 at 08:03
  • may be it's because I am little slow in understanding things I am unable to understand given solution @ST3 can you please tell me how it works, means how code written by you assures that number is div. by 15 in most cases. – r.bhardwaj Dec 17 '13 at 18:43
  • 1
    Your new idea works with 15. It doesn't work with 30, which last time I checked is a multiple of 15. It also gives a false positive for 31. It's just testing if the number is one less than a multiple of 16. – IronMensan Oct 06 '15 at 14:33
  • @IronMensan oh yes.. it was quick idea, but however bad idea. I see what is wrong now. Removing it. – ST3 Oct 07 '15 at 09:59
  • With `gcc -O3 -march=haswell`, both `time()` loops [get auto-vectorized with AVX2](http://goo.gl/dTQ3Yp). The `divisible15` one is significantly smaller, though. Like the scalar case, your multiply-and-test does much less work than gcc's `%15` implementation. gcc uses a similar magic-constant multiply, but it does a 32*32->64b full multiply, and uses both the high and low halves of the result for something. – Peter Cordes Feb 14 '16 at 22:38
25

Just use i % 15 == 0


  1. Since the compiler can easily see that 15 will never change it can feel free to make any optimization it wants on the mod operation. It is a compiler writers job to make this kind of optimization if they haven't thought of a better way to do it you won't.

  2. For example it is very easy to check if a number is divisible by 2 because you just check the first bit. Compiler writers know this and you could write the code yourself to check the first bit but especially a mature compiler will have people thinking about and working on these things for years. This type of optimization is a very simple one to make as it only requires changing an instruction or 2, optimizations like better register allocation are much harder to achieve

  3. One more thing to consider is that your compiler was written for the system that it is on, you code on the other hand is the same everywhere, if you write some strange code that may be just as fast on one system (probably still not faster) but on another system which has a special hardware optimization your code may lose by an order of magnitude. Since you wrote some esoteric code to check for the divisibility the compiler is unlikely to realize it can optimize to a single hardware op, writing the obvious thing to do makes life better for you and easier for the compiler.

  4. Since you haven't actually checked that the speed matters writing code the strange way will make the code very difficult to read for the next person and more error prone ( premature optimization is the root of all evil)

  5. It still works whether the input is 16, 32, or 64 bits since it doesn't rely on an bit manipulation.

  6. Even if the compiler writer hasn't implemented it, it is clearly possible for someone to implement it (even yourself)

aaronman
  • 18,343
  • 7
  • 63
  • 78
  • 1
    5. It still works whether the input is 16, 32, or 64 bits (or any other size) – Adam Rosenfield Sep 09 '13 at 21:41
  • 1
    1. What if the 15 is not constant? The compiler has little hope of optimizing it, but if the possible values all admit techniques parallel to this one, a solution with a table indexed by the modulus could be used. 3. True in general, but unlikely to apply to this case. 4. Easily solved with a comment. – R.. GitHub STOP HELPING ICE Sep 10 '13 at 02:20
  • @R.. what's ur point, the 15 is constant in this case, obviously a mod table only works if the input range is small, are you criticizing my answer BTW I can't tell – aaronman Sep 10 '13 at 02:26
  • Just providing some reasons it *might* be reasonable to manually optimize this like OP asked for. – R.. GitHub STOP HELPING ICE Sep 10 '13 at 02:39
  • Ahh, well 1. if it's not constant then mod is the obvious choice 3. got me there 4. comment may solve readability but it is still more error prone, in fact the reason the accepted answer got downvoted in the first place is because it didn't work, there isn't much chance of messing `number%15==0` up – aaronman Sep 10 '13 at 02:45
  • 1
    I have problems respecting anyone who throws around the "premature optimiztion ..." quote without stating that there are precise conditions (3/97%) in it, i e when it holds true and when it doesn't. I feel that for an ongoing, commercial project the 3% condition is valid - though many project managers would disagree. If you are exploring the subject for fun, pleasure and recreation why should you be bothered with the expression at all? – Olof Forshell Apr 09 '14 at 12:32
7

On a reasonably modern process, dividing by 15 shouldn't be that terrible. The AMD optimisation guide defines it based on the quotient (the value that is being divided), and it takes 8 + bit position of the most significant bit of the quotient. So if your numbers have the 63rd bit set, you end up with 71 cycles - which is quite a long instruction, of course. But for a 32-bit number with a few zeros in the top bits, we're talking 30-40 cycles. If the number fits in a 16-bit value, we the maximum is 23 cycles.

To get the remainder is one more clockcycle on top of that.

If you are doing this ALL the time, of course, you may find that this time is quite long, but I'm not sure there is a trivial way to avoid it.

Like others have said, compiler may be able to replace it with something better. But 15 does not, to my knowledge have an obvious fast solution (if you have 16 instead of 15, then we can use the trick of x & 15).

If it's a limited range, you could build a table [vector<bool> for example, which will store 1 bit per entry], but you'll quite soon run into the problem that the non-cached memory access takes just as long as a divide operation...

There are some interesting ways to figure out if a number divides by 3, 5 and so on by summing the digits, but unfortunately, those only work based on decimal digits, which involves a long sequence of divides.

Mats Petersson
  • 126,704
  • 14
  • 140
  • 227
  • 3
    There is a trivial way to avoid it which is called "division by invariant integers using multiplication." user2623967's answer - an hour ahead of your's - points to this possible solution. It's been around for a long time and is implemented in some compilers, one of which is gcc. You need to do some reading up I think. – Olof Forshell Sep 10 '13 at 08:56
6

Here's another approach, which is probably slower than others, but uses only addition, bitwise-and and shift:

int divisible15(unsigned int x) {
        if (x==0) return 1;
        x = (x & 0x0f0f0f0f) + ((x&0xf0f0f0f0)>>4);
        x = (x & 0x00ff00ff) + ((x&0xff00ff00)>>8);
        x = (x & 0x0000ffff) + ((x&0xffff0000)>>16);
        x = (x & 0x0f) + ((x&0xf0)>>4);
        return x==15;
}

The idea is that divisibility by 15, in base 16, is like divisibility by 9 in base 10 - the sum of digits must be divisible by 15.
So the code sums up all hex digits (similarly to the way you count bits), and the sum must equal 15 (except for 0).

ugoren
  • 16,023
  • 3
  • 35
  • 65
  • You could shift before masking so each line would use the same constant twice. This would probably help on RISC architectures where large immediate constants have to be constructed in registers with multiple instructions (unlike x86, where any instruction can have a 32bit immediate). – Peter Cordes Feb 14 '16 at 22:29
  • 1
    @PeterCordes, I guess so. Something like `x = (x & 0x0f0f0f0f) + ((x>>4) & 0x0f0f0f0f);` – ugoren Feb 15 '16 at 12:57
1

Well, it's very easy to do in your head if you have the hex representation. Just sum all the digits, until you have a single digit. If the answer is '0xf', it's divisible by 15.

Example 0x3a98 : 3 + 0xa + 9 + 8 = 0x1e = 1 + 0xe = 0xf, so that's divisible by 15.

This works for all factors on X-1, where X is the base used to represent the number. (For smaller factors, the final digit must be divisible by the factor).

Don't expect this to be fast in code, though.

Roddy
  • 66,617
  • 42
  • 165
  • 277