0

I have the following C function which performs some multiplications and bit-shifting on 32-bit unsigned integers on GPU. Let's say I start with values X = 0 and C = 2:

void MWC64X_Step(global mwc64x_state_t *s)
{
    uint X=s->x, C=s->c;
    printf("X = %u, C = %u, ", X, C);
    uint Xn=MWC64X_A*X+C;
    uint carry=(uint)(Xn<C); //The (Xn<C) will be zero or one for scalar
    uint Cn=mad_hi(MWC64X_A,X,carry);
    printf("Xn = %u, Cn = %u, ", Xn, Cn);
    s->x=Xn;
    s->c=Cn;
}

with

typedef struct{ uint x; uint c; } mwc64x_state_t;

enum{ MWC64X_A = 4294883355U };
enum{ MWC64X_M = 18446383549859758079UL };

The function is part of a package I found online. I am curious to understand if X becomes very very large, of the order of 2^32, since Xn is declared as an unsigned int (32 bits on my platform), then how is it calculated? Does Xn just keep the value of modulo MAX_INT_32, or does it perform some other operation(s)?

Results I get from a few consecutive runs with X=0 and C=2:

Xn = 2, Cn = 0, 
Xn = 4294799414, Cn = 1, 
Xn = 1207281075, Cn = 4294715476. <--- 3rd iteration

Thanks,

Amine
  • 134
  • 11
  • @kaylum, partly. I suspected it calculated the multiplication by rolling-over using modulo 2^32, but as it turns out in my example, I should be getting Xn=1207281664 in my 3rd iteration above, not 1207281075. I can't see why! – Amine Oct 26 '20 at 00:59
  • What compiler are you using? What are the actual sizes of `MWC64X_A` and `MWC64X_M`? The sizes actually emitted by something like `printf( "sizeof( MWC64X_M ) = %zd\n", sizeof( MWC64X_M ) )`. What are their actual values? What is the actual result of just the multiplication? – Andrew Henle Oct 26 '20 at 01:23
  • @AndrewHenle MWC64X_M is 8 bytes, MWC64X_A is 4 bytes. I use an OpenCL compiler with CL1.2 on top of LLVM clang 9 as a compiler. The actual result I get for the iteration for inputs (X= 4294799414, C=1) is (Xn = 1207281075, Cn = 4294715476) – Amine Oct 26 '20 at 01:30
  • It is not clear how you figure you should have gotten 1207281664 in the third iteration. I get 1207281075 (see my answer) by independent calculation. – David K Oct 26 '20 at 01:54

1 Answers1

1

The constant MWC64X_A = 4294883355U is congruent to -83941 mod 2^32.

2 * (-83941) + 0 = -167882, which is congruent to 4294799414 mod 2^32.

(-167882) * (-83941) = 14092182962, which is congruent to 1207281074 mod 2^32 (according to Excel), and 1207281074 + 1 = 1207281075. So the results you get are consistent with taking the results of multiplication modulo 2^32 every time.

David K
  • 3,147
  • 2
  • 13
  • 19
  • I calculated (4294883355U* 4294799414)%MAX_UNSIGNED_INT which gave me the 1207281664 value. I can't see why I get a different value from what you get, since all I do is take the modulo at the end of my calculations, whereas you take yours at the beginning prior to multiplication. – Amine Oct 26 '20 at 11:24
  • My guess is you lost precision in the multiplication `4294883355U*4294799414`. I get the same end result in Excel, which displays 4294883355*4294799414 as 18445662516252400000.00 (note the decimal point; I think it switched to floating-point representation internally). The end result is 1207281664 = 589493 * 2048, so it looks like we lose 10 binary places of accuracy when multiplying 4294883355*4294799414. (The result should have been divisible by 2 but not divisible by 4.) – David K Oct 26 '20 at 11:45