
I know that, as much as we want to believe computers to be unerring, transistors are not perfect, and 1 + 1 will not always return 2 at the transistor level.

I also know that, to protect us from errors, most computers nowadays have redundancy as well as error-detection and error-correction algorithms.

That being said, what are the chances of the following C++ program printing the wrong result, without warning? Is there even a chance?

#include <iostream>
using namespace std;

int main()
{
    int a = 1, b = 1;
    int sum = a + b;

    cout << "Sum = " << sum;

    return 0;
}

Let's assume we are using an average x64 $1000 laptop, as of 2020.

This question has a broader scope. We run billions of calculations per second; I want to know how much can go wrong in a complex program, on a theoretical level.

Daniele Molinari
  • Try running that in a loop for a couple of months / years and see if you ever get anything but two? When you don't know, measure and research it :D – Carl Mar 14 '20 at 14:53
  • I would categorize the chances for 1 + 1 sum giving the wrong result as: _unlikely_ – Eljay Mar 14 '20 at 14:56
  • It's a good idea, but if I did this and never got anything different, it would not prove that it cannot happen. I could just "get lucky" during the test. Anyway, this question has a broader scope. We run billions of calculations per second; I want to know how much can go wrong in a complex program, on a theoretical level. – Daniele Molinari Mar 14 '20 at 14:57
  • Will your laptop be sent to space without protection? It's very hard to compute the "chances" of a neutron particle colliding with the memory cell that holds the result of the computation at the exact time between computation and output, but it could happen. – KamilCuk Mar 14 '20 at 15:20
  • This might be a question for https://electronics.stackexchange.com/ – Blastfurnace Mar 14 '20 at 15:31
  • If you have access to academic publications, google for "soft error rate empirical". Be sure to specify "sea level" or "aviation" for your relevant application. I don't have access to these journals at the moment without paying, so I could only speculate. – JohnFilleau Mar 14 '20 at 15:35
  • Your concrete example might not be a good one for the general problem you are asking about, because the operation `1 + 1` will be done at compile-time, so there will be no addition at runtime. I don't know what the most likely error source for CPU operations is, but with regards to memory you might be interested in [Cosmic Rays: what is the probability they will affect a program?](https://stackoverflow.com/questions/2580933/cosmic-rays-what-is-the-probability-they-will-affect-a-program) – walnut Mar 14 '20 at 15:57
  • Of interest: [Alpha particles from package decay](https://en.wikipedia.org/wiki/Soft_error#Alpha_particles_from_package_decay). – Eric Postpischil Mar 14 '20 at 16:25
  • Of course for most people this is a “theoretical” question; the natural incidence of such errors is so rare as to be negligible. But it is in fact a practical question, both for exotic uses of computers (such as in space) and because errors can be induced by attackers: Inducing faults is one way of attacking secure computing, such as attempting to learn information about a cryptographic key. – Eric Postpischil Mar 14 '20 at 16:27
  • @walnut Added `a` and `b` variables so that the computation should happen at runtime. If that is not enough, let's assume that they will come from user input (see the sketch after these comments). Happy to know that for hardcoded values the calculation happens at compile time! – Daniele Molinari Mar 14 '20 at 16:59
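
For reference, here is a minimal sketch combining the suggestions from the comments above: the operands come from standard input, so the compiler cannot fold the sum at compile time, and the addition is then repeated in a loop as Carl proposed. The `volatile` qualifier and the one-billion iteration count are illustrative choices, not something prescribed by the question.

#include <cstdint>
#include <iostream>

int main()
{
    int a = 0, b = 0;
    std::cout << "Enter two integers: ";
    std::cin >> a >> b;                 // values unknown at compile time

    // volatile forces the compiler to reload the operands and redo the
    // addition on every iteration instead of folding or hoisting it.
    volatile int va = a, vb = b;
    const int expected = a + b;

    std::uint64_t mismatches = 0;
    for (std::uint64_t i = 0; i < 1'000'000'000ULL; ++i)
    {
        int sum = va + vb;
        if (sum != expected)
            ++mismatches;
    }

    std::cout << "Expected " << expected
              << ", mismatches observed: " << mismatches << '\n';
    return 0;
}

Even if such a loop never reports a mismatch, as noted above it only bounds the probability; it does not prove that an error cannot happen.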

1 Answer


Yes, there is a chance of 1 + 1 yielding something other than 2. The chance of that happening is so close to zero that it cannot be measured.

This is so for the following reasons:

  1. First of all, the likelihood of things going wrong at the quantum level is infinitesimally low. The term "glitch" does exist in IT, but in the vast majority of cases it turns out to be due to some hardware malfunction, like a network cable not making perfect contact. In the remaining, extremely small percentage of cases where a glitch has been observed in software, the word is simply used as another term for "we are not quite sure why this happened"; the cause is most likely a logic bug, a multithreading issue, or some other non-quantum effect. Glitches due to quantum uncertainty do not happen at any rate that has ever required our profession to give them any consideration.

  2. The computer system on which you are going to run this little test program of yours is constantly running megabytes of code that perform various other functions, all of which rely on 1+1, or any other computation, always yielding the correct result. If the slightest mishap were ever to happen, the computer would crash miserably and spectacularly. So, your puny little program does not even need to run: the fact that your computer, and hundreds of millions of computers worldwide, work flawlessly around the clock is proof that 1+1 is always computed as 2 with an extremely high degree of certainty.
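
As a rough illustration of how one might even attempt to put a number on such events, the sketch below multiplies an assumed per-bit upset rate by the amount of memory and the uptime. Both the 8 GB figure and the 1e-13 upsets-per-bit-hour rate are placeholder assumptions, not measured values; the empirical soft-error literature mentioned in the comments is where real figures would come from.

#include <iostream>

int main()
{
    // Every figure below is a placeholder chosen for illustration only;
    // none of them is a measured soft-error rate.
    const double bits_in_dram        = 8.0 * 8e9;  // assume an 8 GB laptop
    const double upsets_per_bit_hour = 1e-13;      // hypothetical upset rate
    const double hours_of_uptime     = 24.0 * 365.0;

    const double expected_bit_flips =
        bits_in_dram * upsets_per_bit_hour * hours_of_uptime;

    std::cout << "Expected bit flips per year under these assumptions: "
              << expected_bit_flips << '\n';
    return 0;
}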

Mike Nakis
  • My question is not just about the issues at a quantum level, but about the probability of error of the whole "computer system", when executing a very simple task. Computer system being the sum of hardware and software (OS). – Daniele Molinari Mar 14 '20 at 15:17
  • Well, I am afraid that in that case the question is too broad and cannot be answered. It depends on the complexity of the software, the variety of the software and therefore the interactions between subsystems, the CPU load at which the software is running, etc. Nobody has an answer for that, let alone an answer for a consumer product which was put together within the last few months and keeps receiving updates every couple of weeks or so. – Mike Nakis Mar 14 '20 at 15:24
  • I mean "on average". But I understand your point! I will leave this question open for a while and maybe provide a bounty. The most interesting answer will win. – Daniele Molinari Mar 14 '20 at 15:30
  • Re “If the slightest mishap was to ever happen, the computer would crash miserably and spectacularly”: No, many errors would go unnoticed in the short term, some in the long term. Even crashes would not be spectacular. In many production systems, a process crash would result in the system restarting the process and logging it as a statistic. The computer and operating system itself would not crash. – Eric Postpischil Mar 14 '20 at 16:22
  • @EricPostpischil true to a certain extent, but not enough to make a difference. By far the most frequent computation performed by your computer is adding an offset to the stack pointer when moving values to/from the machine stack, and adding an offset to the instruction pointer to jump to another location. If any of those were to fail, you would immediately have a crash. If they were to fail in the kernel, (or in a kernel-mode driver,) you would immediately get an unrecoverable error, also known as a "Blue Screen of Death". – Mike Nakis Mar 14 '20 at 17:52