2

Couple things:

  • I am a university student
  • This is not a homework problem (I'm curious about how bignum libraries work)
  • All the code snippets here can be compiled with gcc -lgmp

Here is my implementation which keeps returning incorrect answers. I've tried going through with gdb & checking individual ops with GMP and marking which operations are definitely failing in the code, but I haven't managed to find specifically what's going wrong.

Beyond that, I can't identify where the issue is.

uint64_t *multiply(u256 c, u256 a, uint64_t alen, u256 b, uint64_t blen) {
    if (alen == 1 || blen == 1) {
        u2_set(c, 0);
        c[0] = a[0] * b[0];
        c[1] = c[0] >> 32LU;
        c[0] &= MASK;
        return c;
    } else {
        uint64_t al, bl;
        uint64_t M = min(alen, blen);
        al = ceil(alen, 2);
        bl = ceil(blen, 2);
        uint64_t m = M / 2;
        u256 low1, low2, high1, high2, z0, z1, z2;
        u2_set(low1, 0);
        u2_set(low2, 0);
        u2_set(high1, 0);
        u2_set(high2, 0);
        memcpy(low1, a, sizeof(uint64_t) * al);
        memcpy(low2, b, sizeof(uint64_t) * bl);
        memcpy(high1, &(a[al]), sizeof(uint64_t) * (alen - al));
        memcpy(high2, &(b[bl]), sizeof(uint64_t) * (blen - bl));

        /* Starts ok, goes bad immediately. Mostly ok. */
        multiply(z0, low1, al, low2, bl);

        u2_add(low1, low1, high1); /* CHECKED */
        u2_add(low2, low2, high2); /* CHECKED */

        multiply(z1, low1, al, low2, bl); /* Starts ok, goes bad immediately. */

        /* Bad but starts & usually is ok */
        multiply(z2, high1, (alen - al), high2, (blen - bl));

        /* ALL GOOD */
        u256 o;
        u2_sub(z1, z1, z2);
        u2_sub(z1, z1, z0);
        u2_shl(z1, z1, m);
        u2_shl(z2, z2, m * 2);
        u2_add(o, z2, u2_add(z1, z1, z0));

        u2_copy(c, o);
        return c;
    }
}

To be honest, the worst part is not having test vectors for the intermediate values. Beyond the ones in the wikipedia page example (which I seem to pass), I can't find anything.

u256 is an array of 8 uint64_t's. I'm only using the lower 32 bits of those integers to make masking and carries a bit easier. The idea is to do the Karatsuba multiplication base 232.

The rest of the code can be found here.

Any help would be appreciated.

EDIT:

Example of what's wrong:

a = 89637787616193205010487395076171907549998432494691382753495063501471509582488
b = 62570744880332518477645686292054860709644349224438728618405612469614908709348

Multiplying the 2 should return

16439926136035242788212517705012640494860769275971090459163708006828967815008

Instead it returns

38696864898835518778416

That specific test can be run by compiling this code: here

with gcc and -lgmp

ceil is defined as so:

#define ceil(a, b) (((a) + (b) - 1LU) / (b))

using this test code, which checks stuff a bit more in-depth, I believe I narrowed the problem down some.

It occurs on the m=4 multiply of z0 = low1 * low2. The two values multiplied are 100000000000 and 100000000000, and the expected result is 10000000000000000000000 while it ends up being 1478053504202008644. Running the above test code will reproduce this. I assume this means my karatsuba implementation is simply broken, as the components aren't throwing any errors and it's specifically failing on the m=2 multiplications. I'm checking the m=1 multiplications and they seem fine.

Lev Knoblock
  • 611
  • 2
  • 6
  • 20
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/225227/discussion-between-greybeard-and-lev-knoblock). – greybeard Nov 27 '20 at 18:22
  • too many things might going on as you are mixing more libs together ... Are you sure the format is correctly transformed between each step ... for example you use GMP for conversion between string and number then convert that to u256 and then to array of uint64 ... The same goes for printing you can have bug at any point (truncating, different format or base, different endianness ...) much easier is to get rid of the stuff and do all from scratch without libs. For starters use hex numbers for printing and setting the variables. – Spektre Nov 28 '20 at 12:28
  • see this [ALU32](https://stackoverflow.com/a/26603589/2521214) you can tweak it to make your 64bit ALU so you do not need to convert between libs with different format. It provides the basic stuff you need like add,adc,mul, bitshifts ... with carry so they can be stacked up on your arrays instead of using some fixed 256bit data type. Here [Fast bignum square computation](https://stackoverflow.com/q/18465326/2521214) exmaple of karatsuba multiplication and more using the ALU operations... ALU is C++ but you can easily port that to C++ just move the carry to global viarables ... – Spektre Nov 28 '20 at 12:35
  • @Spektre I've checked all the gmp stuff multiple times, and I kind of need it for checking the math, but other than that I'm not using any other libraries. Thanks for the suggestions! I'll take a look. – Lev Knoblock Nov 28 '20 at 17:15
  • @Spektre as for the variable length array vs 256 bit type, I'm specifically working towards making a fixed-width implementation. – Lev Knoblock Nov 28 '20 at 17:23
  • @LevKnoblock what exactly is `u256`? its an array of 8 x 32 bit uints ? if yes then your operation `+,-` have wrong computation of carry as `c[0] = a[0] + b[0]; carry = c[0] >> 32LU != 0;` will be always zero. if they are 8 x 64 bit you wasting space and compute inconsistelly.... as you are not masking out the carry after usage ... – Spektre Nov 28 '20 at 17:44
  • @LevKnoblock I compute carry for `c=a+b` addition like this: `c=a+b; cy=DWORD(((a &1)+(b &1) )>> 1); cy=DWORD(((a>>1)+(b>>1)+cy)>>31);` where `DWORD` is the datatype of `a,b,c` (32 bit uint). – Spektre Nov 28 '20 at 17:50
  • @Spektre 8x64 bit, and if you look at my +,- functions I am masking out the carries. – Lev Knoblock Nov 28 '20 at 17:52
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/225263/discussion-between-lev-knoblock-and-spektre). – Lev Knoblock Nov 28 '20 at 17:53

0 Answers0