Wrong result when calculating the sum of 1 to n using loop

Question

I want to calculate sum of 1 to n using loop and this is my C code:

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]){
    long sum = 0;
    int n = atoi(argv[1]);
    for(int i = 1; i<= n; i++)
        sum += i;
    printf("Sum of serial from 1 to %d = %ld\n", n, sum);
    return 0;
}

when I run the program with small n, the result is correct but when I run with large number this is wrong. Here is the result when I run this:

./sum 100000
Sum of serial from 1 to 100000 = 705082704

What did you expect to get instead? Do you know what is the largest number that can be represented by the `long` type? — mkrieger1, Mar 13 '23 at 13:05
you overflow the `sum` and your program invokes undefined behaviour — 0___________, Mar 13 '23 at 13:09
Please [edit] your question and add details about your platform, e.g. processor, OS, compiler including version number. — Bodo, Mar 13 '23 at 13:14

score 5 · Answer 1 · answered Mar 13 '23 at 13:09

It looks like long in your environment is 32-bit long and the maximum value it can hold is 2,147,483,647. The sum of integers from 1 to 100,000 is 5,000,050,000, which exceeds the limit.

In C99 and later, you can use long long type, whose maximum value is at least 9,223,372,036,854,775,807.

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]){
    long long sum = 0;
    int n = atoi(argv[1]);
    for(int i = 1; i<= n; i++)
        sum += i;
    printf("Sum of serial from 1 to %d = %lld\n", n, sum); // use %lld to print "long long" value
    return 0;
}

int64_t type in inttypes.h may be supported by more compilers.

#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>

int main(int argc, char *argv[]){
    int64_t sum = 0;
    int n = atoi(argv[1]);
    for(int i = 1; i<= n; i++)
        sum += i;
    printf("Sum of serial from 1 to %d = %" PRId64 "\n", n, sum); // use PRId64 macro to print "int64_t" value
    return 0;
}

You could also use `unsigned long long` which would go up to 18,446,744,073,709,551,615!!! — coulomb, Mar 13 '23 at 13:56
True, but `unsigned` it would mathematically work for summing `1` to `n` for `n <= 4294967295`, which just so happens to be the limit for an unsigned `int`. — Rogue, Mar 13 '23 at 13:56

score 1 · Answer 2 · answered Mar 13 '23 at 13:34

Use unsigned large types to avoid undefined behaviour (UB) and easily detect the problem.

#include <stdint.h>

unsigned long long sum(unsigned long num)
{
    unsigned long long result = 0, previous = 0;

    for(unsigned long i = 1; i <= num; i++)
    {
        result += i;
        if(result < previous)
        {
            printf("Wraparound at : %lu\n", i);
            break;
        }
        previous = result;
    }
    return result;
}

int main(void)
{
    printf("%llu\n", sum(1000000));
    printf("%llu\n", sum(6074001001UL));
}

https://ideone.com/g6isU6

Minor: The `L` in `6074001001UL` not needed, yet is nicely illustrative. — chux - Reinstate Monica, Mar 13 '23 at 15:06

Pignotto · Answer 3 · 2023-03-13T17:54:26.230

The C standard only guarantees at least 16 bits of width for int and 32 bits for long

The sum of numbers from 1 to n, where n is a k bit number, needs 2*k bits at most and 2*k-1 bits at least

100000 is already out of range for that minimum size of int, needing 17 bits. A safe choice for it would be using long (but in general you should sanitize your input to fit in the chosen type)

Its sum needs at least 33 bits. A safe choice for it would be using long long (C99)

Or you could use the types in the stdint header for better control (C99)

Also, please notice:

atoi causes UB when the value is not representable;
for (int i=1; i<=n; i++) causes UB when n==INT_MAX (*)
sum += i causes UB when sum + i is not representable

(*): using unsigned instead causes an infinite loop which if it has no side effects is UB

score 0 · Answer 4 · answered Mar 13 '23 at 13:07

0

the data type of the variable "sum" is "long", which has a maximum value of 2^31-1, or approximately 2.1 billion. you can fix this issue by using data type that can store larger values, such as "long long" or "unsigned long long".

answered Mar 13 '23 at 13:07

mhmmdamiwn

146
9

2

Re “has a maximum value of”: The maximum value of `long` may vary from one C implementation to another. It is not fixed at 2^31−1. When writing Stack Overflow answers to C questions, it is preferable to speak to the C language generally, usually as defined by the C standard, not just the properties of specific C implementations (except for questions pertaining to specific implementations, of course). – Eric Postpischil Mar 13 '23 at 13:16

Jon Green · Answer 5 · 2023-03-13T16:44:41.403

I'm guessing you're running it on a 32-bit machine, or using a 32-bit compiler. You need to force 64-bit arithmetic, because you overflowed a 32-bit signed container with a number that was 33 bits unsigned.

#include <inttypes.h> at the top, to get access to fixed-size types and printf() macros.
Change the types of sum, i and n to uint64_t: 64-bit unsigned int.
Instead of atoi(), use strtoull() (allows users to specify numbers in bases other than decimal, and gives a guaranteed unsigned long long) - or, better still, sscanf(argv[1], SCNu64, &n), which you know will guarantee to parse a uint64_t.
Change your printf() line to printf("Sum of serial from 1 to " PRIu64 " = " PRIu64 "\n", n, sum); so that you see the full 64-bit values.

It should all work after that. (Arguably, you could get away with uint32_t for n, but you may as well use 64-bit throughout.)

Incidentally, you should check that argc is at least 2 before looking at argv[1], otherwise the outcome will be (1) undefined and (2) definitely undesirable. Checking argv[1] consists entirely of valid numerals would be a good idea too, or that sscanf() returns 1, which indicates it successfully parsed the one item requested.

Giving a list of source code changes without explaining why or what they do or what issue caused the undesired output is not a good answer. — Eric Postpischil, Mar 13 '23 at 13:12

chux - Reinstate Monica · Answer 6 · 2023-03-16T21:49:59.320

0

Overflow

Code experienced signed integer overflow, which is undefined behavior (UB).

Instead code could detect the upcoming problem.

for(int i = 1; i<= n; i++) {
  if (i > LONG_MAX - sum) {
    fprintf(stderr, "Sum is out of range\n");
    break;
  }
  sum += i;
}

Code can use wider types to forestall the problem, yet the problem can remain for larger n.
Note: even INT_MAX == LLONG_MAX is possible.

edited Mar 16 '23 at 21:49

answered Mar 13 '23 at 15:11

chux - Reinstate Monica

143,097
13
135
256

Signed integer overflow is not (or should not be) _undefined behavior_, but rather _implementation defined_ behavior, which means determinate behavior, but that must be documented & defined by the compiler implementor. – Luis Colorado Mar 16 '23 at 20:43
@LuisColorado OP's `sum += i;` risks _signed_ integer overflow which, as I see it, is [UB](https://stackoverflow.com/a/36342550/2410359) per C spec. What is your reference that it is _implementation defined behavior_? – chux - Reinstate Monica Mar 16 '23 at 21:22
IMHO (I'm not sure as I don't read the standard every time I have this kind of issues) the standard defines two such unclear situations: what is called _undefined behaviour_ that states that no expected valid value can be expected, because the event that triggered it has no meaning in the language at all, and unexpected behaviour, because there exist different ways to handle it by how different architectures deal with it. In this case, despite there can be several possible different results, the one that occurs is determined by how things are done, and the standard specifies that... – Luis Colorado Mar 16 '23 at 21:34
the implementation should select one way to implement it, and document it, leading to what is called _implementation defined behavior_. And I think the `signed int` overflow is one of such cases (I repeat, I have not searched the standard to check it) is one of these cases in which the implementation should document what happens (for `unsigned`s there's only one way to overflow, but on `signed`, several possibilities are considered, and the implementations are free to select the more efficient, but once documented and selected, it stops being _undefined_ to be _implementation defined_ – Luis Colorado Mar 16 '23 at 21:39
@LuisColorado C also has "EXAMPLE An example of undefined behavior is the behavior on integer overflow.". Spec is clear enough that signed integer overflow is UB. Compilers often do not have signed integer overflow as implementation defined behavior to allow for various optimizations. If a compiler followed the _should select one way to implement it_, it can result in loss of performance. I suspect the industry favors performance over specificity as popular compilers take full advantage of UB. Do you have a compiler in mind that does specify behavior on signed integer overflow? – chux - Reinstate Monica Mar 16 '23 at 21:47
Well, I have been capable to answer the question giving a reasoning for the deterministic behavior of the program, and it's output. This is something that I observe in every machine that uses two's complement arithmetic with integers since 1987. So does undeterministic in this context mean something still to come? is somebody planning to write a quantum C compiler that integer overflows in an undeterministic and really undefined way? The reason for a different treatment of signed and unsigned (which is perfectly deterministic and not undefined on overflow) is that there are.... – Luis Colorado Mar 17 '23 at 11:57
several, diferent ways to control these overflows. It only affects portability between architectures, or perhaps gives compiler manufacturer to implement them in different ways. Somebody must be hating or has not fully understood that when you overflow a two's complement register, you start comming back from the negative side, and it seems to be a pain to calculate how things can happen in a deterministic, reliable and predictable way. IMHO, using integer overflow to show undefined behaviour is just some unfortunate example that doesn't give a full answer to the question of _undefined_ – Luis Colorado Mar 17 '23 at 12:01
(undefined can be defined just by adding the way it behaves --a definition-- in the documentation of the compiler, It will never be undefined anymore) – Luis Colorado Mar 17 '23 at 12:04
@LuisColorado The classic example of UB is `for (int i = 0; i >= 0; i++) { ... }` optimized to `for (int i = 0; 1; i++) { ... }`. You may find it more enlightening to post a question about your thoughts about a _signed integer overflow_ than to try to say, as a comment about UB here. Because after all this discussion, for better or worse, C in 2023 has _signed integer overflow_ as UB, even if it appears a poor spec choice. – chux - Reinstate Monica Mar 17 '23 at 18:27

Luis Colorado · Answer 7 · 2023-03-16T21:20:03.847

That sum is n*(n+1)/2, which results in 100000*(100000 + 1) / 2 = 5000050000. That is far over the maximum long 32 bit integer (only in case your long is 32 bit, which seems to be the case, see below) operating in two's complement, you should have gotten that amount minus 4294967296 (2^32) or 705082704 (which is the value you are showing printed out)

Use

#include <stdint.h>

and declare your sum variable as uint64_t, that grows up to 18446744073709551615 (the maximum value of that type) and avoids the overflow until you reach the value 6074001001 (by adding, don't make the calculation with the formula I use above that you will get an overflow before dividing by 2 and will get a wrong result again) that will overflow (but it overflows before the 32 bit int you use for the loop index i, which overflows at 2147483648)

Wrong result when calculating the sum of 1 to n using loop

7 Answers7