16

I need to divide numbers represented as digits in byte arrays with non standard amount of bytes. It maybe 5 bytes or 1 GB or more. Division should be done with numbers represented as byte arrays, without any conversions to numbers.

Tyler Durden
  • 11,156
  • 9
  • 64
  • 126
Kosmo零
  • 4,001
  • 9
  • 45
  • 88
  • http://en.wikipedia.org/wiki/Barrett_reduction – Regenschein Jun 26 '13 at 12:23
  • 2
    Something like [Java's BigInteger](http://docs.oracle.com/javase/7/docs/api/java/math/BigInteger.html)? – Bernhard Barker Jun 26 '13 at 14:00
  • 1
    Barrett reduction calculates a modulus, not a quotient. – Tyler Durden Jun 26 '13 at 14:00
  • Could Java's BigInteger handle gigs of bytes? – Kosmo零 Jun 26 '13 at 14:12
  • The OP does not say he is using Java. Java's implementation of arbitrary precision math is relatively inefficient, but could handle billions of digits if you had enough memory. It would take a long time to compute the quotient. – Tyler Durden Jun 26 '13 at 14:17
  • 3
    For generic questions like this you should be using the Wikipedia and coming here AFTER you have read the wikipedia and tried something. – Tyler Durden Jun 26 '13 at 14:18
  • @Kosmos It should support as big a number as will fit into memory. But I was referencing it more as a starting point. Java's API code is open source, so you can look at how they did it. – Bernhard Barker Jun 26 '13 at 14:24
  • possible duplicate of [Algorithm for dividing very large numbers](http://stackoverflow.com/questions/2884172/algorithm-for-dividing-very-large-numbers) – Tyler Durden Jun 26 '13 at 14:28
  • 1
    wikipedia doesn't answer on questions what the fastest one. I no need divisions that should be running for a days. – Kosmo零 Jun 26 '13 at 14:32
  • `BigInteger` is a pretty bad thing to use if you want your operations to be fast. Multiplication and division are done schoolbook in the canonical implementation. – tmyklebu Jun 26 '13 at 15:34
  • 1
    @Tyler: ... and it obtains the remainder by first computing the quotient, and then subtracting off the appropriate multiple of the divisor. –  Oct 05 '15 at 07:50

2 Answers2

19

Divide-and-conquer division winds up being a whole lot faster than the schoolbook method for really big integers.

GMP is a state-of-the-art big-number library. For just about everything, it has several implementations of different algorithms that are each tuned for specific operand sizes.

Here is GMP's "division algorithms" documentation. The algorithm descriptions are a little bit terse, but they at least give you something to google when you want to know more.

Brent and Zimmermann's Modern Computer Arithmetic is a good book on the theory and implementation of big-number arithmetic. Probably worth a read if you want to know what's known.

tmyklebu
  • 13,915
  • 3
  • 28
  • 57
  • Yes, but... as I said these algorithms are a LOT more complicated than Algorithm D. The main reason for doing divide and conquer is so that you can use the Karatsuba algorithm, and let me tell you writing all this plus a Karatsuba implementation will be a LOT of work, and I mean a LOTTA LOTTA work. I don't know how good a programmer the OP is, but even a very good programmer could spend MONTHS writing a correct implementation using divide and conquer. – Tyler Durden Jun 26 '13 at 16:37
  • @TylerDurden: Well, he asked about "crazy large numbers." Karatsuba isn't bad on its own, since you don't run into power-of-two issues. It starts becoming a lot of work when you start wanting to implement Toom-Cook and the various FFT-based methods, then figuring out the crossover points. This is why you use GMP for that :) – tmyklebu Jun 26 '13 at 17:43
  • @TylerDurden: And divide-and-conquer division isn't too bad at all once you have a fast multiplication black-box. Take the high half of the numerator and denominator, recursively divide, and then do a little cleanup afterward. Again, tuning and finding the crossover point between schoolbook and divide-and-conquer is a fair bit of work. – tmyklebu Jun 26 '13 at 17:45
11

The standard long division algorithm, which is similar to grade school long division is Algorithm D described in Knuth 4.3.1. Knuth has an extensive discussion of division in that section of his book. The upshot of this that there are faster methods than Algorithm D but they are not a whole lot faster and they are a lot more complicated than Algorithm D.

If you determined to get the fastest possible algorithm, you can resort to what is known as the SRT algorithm.

All of this and more is covered by the way on the Wikipedia Division Algorithm.

Tyler Durden
  • 11,156
  • 9
  • 64
  • 126
  • Of the algorithms listed on the wikipedia link, you'll probably find **long division** to be the most useful. Be careful of the notation though. **D(0)** indicates the least significant value in the number, while the left shift suggests that the numbers are stored big endian wise (which means the LSD should be at **D(n-1)**). – Python Cheese Apr 13 '17 at 21:51