2

Given this simple random generator:

int i, r = 0;
for (i = 0; i < 50; i++) {
    r = (1234 * r + 101) % (11000000);
    printf("%d\n", r);
}

Surprisingly, I get negative values!

101
124735
10923091
192507
6553739
-7620565
-10842517
-10763989
-1860437
8188139

Isn't supposed to be positive values? Can someone explain this?

chqrlie
  • 131,814
  • 10
  • 121
  • 189
Don
  • 393
  • 3
  • 12
  • 5
    No, it's not supposed to be positive. If the first operand is negative the result will also be negative. – Sami Kuhmonen Jul 02 '17 at 19:27
  • 4
    integer overflow: `1234*6553739 + 101` is most likely more than `INT_MAX` – UnholySheep Jul 02 '17 at 19:27
  • @Don: `long` might not be large enough, `unsigned long long` is guaranteed to have at least 64 value bits, which is enough. – chqrlie Jul 02 '17 at 19:30
  • Related: [Does either ANSI C or ISO C specify what -5 % 10 should be?](https://stackoverflow.com/questions/3609572/does-either-ansi-c-or-iso-c-specify-what-5-10-should-be) – Martin R Jul 02 '17 at 19:31

3 Answers3

6

You get negative values because your program has integer arithmetics overflows. The behavior is actually undefined for signed type int. You should use a larger type to avoid this. Type unsigned long long is guaranteed to have at least 64 value bits, which is enough for the maximum intermediary result 1234 * 10999999 + 101.

int i;
unsigned long long r = 0;
for (i = 0; i < 50; i++) {
    r = (1234 * r + 101) % 11000000;
    printf("%llu\n", r);
}

rici commented that r does not need be a larger type since it's value is in range 0..10999999. This is not completely true as type int may be too small to handle such values. The range for int can be as small as -32767..32767.

Nevertheless, The intermediary computation must be performed with a larger type to avoid arithmetic overfow. Here is the corresponding code:

int i, r = 0;  // assuming 32-bit ints
for (i = 0; i < 50; i++) {
    r = (1234ULL * r + 101) % 11000000;
    printf("%d\n", r);
}
chqrlie
  • 131,814
  • 10
  • 121
  • 189
  • I agree. Thank you – Don Jul 02 '17 at 19:30
  • `unsigned` overflow actually has defined behavior (unlike signed overflow) – UnholySheep Jul 02 '17 at 19:31
  • It's ok for `r` to be an int; it's the intermediate value which needs to be larger (and preferably unsigned). `r = (1234ULL * r + 101) % 11000000;` – rici Jul 02 '17 at 19:35
  • @UnholySheep, but even `unsigned` 32bit arithmetic will give wrong results here, as the effect of having `2^32` modulus (before doing the final `11000000` modulus) is changing the output of the calculation and giving wrong results. He has two routes: 1) as the solution given here, forcing the calculations to be done in 64bit integers, before assigning to the integer variable or 2) as in other solution given (and here also), to do all the calculations in 64bit arithmetic, using for example `uint64_t` variables. – Luis Colorado Jul 03 '17 at 05:56
  • @LuisColorado `unsigned long long` is guaranteed to be at least 64 bits. Also OP only asked how to ensure that they get positive values from the modulus operation. (And it is not specified whether an overflow is considered a "wrong" result in this context or not) – UnholySheep Jul 03 '17 at 06:21
1

As you've seen in other answers, this behavior is due to overflow.

If you want to be able to detect stuff like this earlier, use gcc or clang's Undefined Behavior Sanitizer (UBSan).

$ /opt/clang+llvm-4.0.0-armv7a-linux-gnueabihf/bin/clang -fsanitize=undefined don.c 

$ ./a.out 
don.c:8:18: runtime error: signed integer overflow: 1234 * 10923091 cannot be represented in type 'int'

don.c, line 8, column 18 is the multiplication in this line: r = (1234*r +101) % (11000000);.

Brian Cain
  • 14,403
  • 3
  • 50
  • 88
0

You have to be careful, as your code is producing overflows, even if you do unsigned arithmetic.

Probably your int variable is a 32 bit integer which overflows after the number 2.147.483.647, and if you consider the worst case of your computation, you'll have 1.234*10.999.999 + 101 ==> 13.573.998.867, before calculating the modulus operation, and this will lead you to error.

The best thing you can do is to use 64 bit number for this kind of calculation not to overflow, with this sample code (you'll see different results even for your normal positive ones)

$ cat pru.c
#include <stdio.h>
#include <stdint.h>

int main()
{
    uint64_t i, r=0;
    for (i = 0; i < 50; i++) {
            r = (1234*r +101) % (11000000);
                printf("%llu\n", r);
    }
}

which results in:

$ pru
101
124735
10923091
4094395
3483531
8677355
4856171
8515115
2652011
5581675
1787051
5221035
7757291
2497195
1538731
6794155
1987371
10415915
5239211
8186475
4110251
1049835
8496491
1669995
3773931
4030955
2198571
7036715
4306411
1111275
7313451
4798635
3515691
4362795
4689131
387755
5489771
9377515
10853611
6356075
396651
5467435
3814891
10575595
4284331
6864555
860971
6438315
2880811
1920875

This is correct, as 1.234*10.999.999 + 101 ==> 13.573.998.867 will never overflow a uint64_t number (this is the maximum result you can have) and will produce correct results.

Luis Colorado
  • 10,974
  • 1
  • 16
  • 31