11

This is the context of my program.

A function has 50% chance to do nothing, 50% to call itself twice. What is the probability that the program will finish?

I wrote this piece of code, and it works great apparently. The answer which may not be obvious to everyone is that this program has 100% chance to finish. But there is a StackOverflowError (how convenient ;) ) when I run this program, occuring in Math.Random(). Could someone point to me where does it come from, and tell me if maybe my code is wrong?

static int bestDepth =0;
static int numberOfPrograms =0;
@Test
public void testProba(){
   for(int i = 0; i <1000; i++){
       long time = System.currentTimeMillis();
       bestDepth = 0;
       numberOfPrograms = 0;
       loop(0);
       LOGGER.info("Best depth:"+ bestDepth +" in "+(System.currentTimeMillis()-time)+"ms");
   }
}

public boolean loop(int depth){
    numberOfPrograms++;
    if(depth> bestDepth){
        bestDepth = depth;
    }
    if(proba()){
        return true;
    }
    else{
        return loop(depth + 1) && loop(depth + 1);
    }
}

public boolean proba(){
    return Math.random()>0.5;
}

.

java.lang.StackOverflowError
at java.util.Random.nextDouble(Random.java:394)
at java.lang.Math.random(Math.java:695)

. I suspect the stack and the amount of function in it is limited, but I don't really see the problem here.

Any advice or clue are obviously welcome.

Fabien

EDIT: Thanks for your answers, I ran it with java -Xss4m and it worked great.

Raedwald
  • 46,613
  • 43
  • 151
  • 237
Fabinout
  • 878
  • 7
  • 25
  • I suspect it would terminate most of the time, but since you are calling it 1000 times you are pretty much guaranteeing yourself to get a stack overflow from the unlucky tons of recursions. – Kevin DiTraglia Aug 08 '13 at 12:16
  • 11
    That it happens at `random()` is just a coincidence. The cause is the deep recursion in `loop()`. – kiheru Aug 08 '13 at 12:17
  • 7
    The problem with the logic behind "it always finishes" is the same as behind the logic that tells you that [you can always win by doubling your bet after a loss](http://en.wikipedia.org/wiki/Martingale_%28betting_system%29): bad luck is bound to have a streak that would be long enough to overflow your stack, or empty your purse. – Sergey Kalinichenko Aug 08 '13 at 12:18
  • 3
    StackOverflowErrors frequently show up in a leaf function called by the real culprit. The real culprit (`loop`) *nearly* fills the stack, then it calls something else and it's *that* call that finally crosses the limit. That's quite common really. – harold Aug 08 '13 at 12:18
  • @kiheru Well I ran it thousands of times (literally) and it always occurred at this very line. Is there any way to clear the stack to get rid of this problem? – Fabinout Aug 08 '13 at 12:20
  • Can you tell the depth at which the Exception takes places? There is a known problem with java recursion if you recur too deep and keep stacking variables. – Lake Aug 08 '13 at 12:21
  • @Fabinout You could throw it in a try / catch, but then you are never going to get a true depth of how far it goes. – Kevin DiTraglia Aug 08 '13 at 12:21
  • @Fabinout why are you returning true only if two sequential runs return true? Why not just one? – supersam654 Aug 08 '13 at 12:22
  • Is it possible that the stack can accept an even number of functions, and that Math.random() is in fact always the uneven-th function added in the stack? – Fabinout Aug 08 '13 at 12:22
  • @dasblinkenlight Well, if you had infinite amount of money, you'd always win, that's theoretical, but it's true. But yeah, this analogy works pretty well in my case :) – Fabinout Aug 08 '13 at 12:26
  • @Fabinout You can't clear the call stack. You always get the exception at the same place because the maximum stack depth is always the same. – kiheru Aug 08 '13 at 12:26
  • 3
    Not related to your issue at all, but `Math.random()>0.5` should be `Math.random()>=0.5`. `random` returns a value from 0 inclusive to 1 exclusive, so as it is now there's actually slightly less than 50% chance a recursion will occur. – Syon Aug 08 '13 at 12:41
  • @Syon Well it's true. – Fabinout Aug 08 '13 at 12:42
  • Sorry, meant to say slightly **greater** than 50% chance. – Syon Aug 08 '13 at 12:48
  • @Syon Indeed, p(x>0.5) when x in [0-1[ is infinitely lower than 1/2. – Fabinout Aug 08 '13 at 12:54
  • Probability is a funny thing. 50% chance, with no further qualification, basically means that in the past, present and future of the universe, you'll find as many "yes"es as "no"s. What it doesn't say is that you will definitely see a particular balance of outcomes on a scale smaller than that (although you're likely to). For example, although incredibly unlikely and suspicious, it is possible for someone to flip a coin a thousand times and have it always land on tails. – cHao Aug 08 '13 at 13:03
  • @cHao Some people already won twice the big lottery prize (millions and millions). – Fabinout Aug 08 '13 at 13:18
  • What an interesting problem. How do you show that the probability of terminating is 1? How do you calculate the expected depth? I can't seem to figure either of those questions out... – BlueRaja - Danny Pflughoeft Aug 08 '13 at 13:38
  • My point is, the only thing you can predict is that the universe will hand out as many "yes"es as "no"s. If at any point you see more "no"s then "yes"es, the program will halt. But if you consistently got back "yes", "no", "yes", "no" -- which sits right at 50% -- the program would happily continue forever. At that point, it's the *machine* that fails. Memory is finite, and absent some miracle of technology, you *will* eventually run out of space. (And the machine's death doesn't mean the program finished; it just means physics intervened.) – cHao Aug 08 '13 at 13:42
  • @A.Webb: Sorry, I don't see how that applies. After asking, I realized that this problem is nearly equivalent to asking "What is the probability of eventually seeing more total heads than tails," which explains why the probability is 1. But that doesn't explain how to calculate the expected depth *(nor, as far as I can tell, does your link)* – BlueRaja - Danny Pflughoeft Aug 08 '13 at 13:48
  • I've now asked a question about this [here](http://math.stackexchange.com/questions/462824). – BlueRaja - Danny Pflughoeft Aug 08 '13 at 14:22
  • 1
    @dasblinkenlight I just realised that my program is wrong. When I use " return loop(depth + 1) && loop(depth + 1);", it doesn't evaluate the second item if the first one is false. I need to rewrite it. – Fabinout Aug 08 '13 at 14:55
  • @BlueRaja-DannyPflughoeft Sorry, I read too quickly. I thought you were asking what is the probability of blowing a fixed stack depth. Your question is much more interesting. I look forward to seeing the response on math stackexchange. – A. Webb Aug 08 '13 at 15:03
  • @BlueRaja-DannyPflughoeft I'm glad my question raised another bunch of interesting questions :) – Fabinout Aug 08 '13 at 15:11
  • This is really one of the better questions here. But I disagree with your statement "this program has 100% chance to finish". Even assuming infinite stack size, you have a non-zero chance of not finishing after n steps for an arbitrary but fixed value of n. – Sentry Aug 08 '13 at 17:27
  • @Sentry: As proved [here](http://math.stackexchange.com/questions/462824/), the probability of the program eventually terminating is 100%. However, the "average" stack-depth of this program is infinite, meaning no matter how large we make the stack *(even if it uses all the memory in the world)*, the program could very easily use it all up. There's an easy way to mitigate this.. maybe I'll post it as an answer. – BlueRaja - Danny Pflughoeft Aug 09 '13 at 16:17
  • @BlueRaja-DannyPflughoeft Thanks for the link to the prove. Seems legit, but somehow my practical understanding refuses to accept it ;) As someone else already said: "eventually" can be a long time ;) – Sentry Aug 09 '13 at 19:14
  • Possible duplicate of [What is a StackOverflowError?](http://stackoverflow.com/questions/214741/what-is-a-stackoverflowerror) – Raedwald Jan 22 '16 at 13:26

4 Answers4

14

Whenever a function is called or a non-static variable is created, the stack is used to place and reserve space for it.

Now, it seems that you are recursively calling the loop function. This places the arguments in the stack, along with the code segment and the return address. This means that a lot of information is being placed on the stack.

However, the stack is limited. The CPU has built-in mechanics that protect against issues where data is pushed into the stack, and eventually override the code itself (as the stack grows down). This is called a General Protection Fault. When that general protection fault happens, the OS notifies the currently running task. Thus, originating the Stackoverflow.

This seems to be happening in Math.random().

In order to handle your problem, I suggest you to increase the stack size using the -Xss option of Java.

Paulo Tomé
  • 1,910
  • 3
  • 18
  • 27
Levente Kurusa
  • 1,858
  • 14
  • 18
  • My further question was: "why does it happen every time in Math.Random?" Is it possible that the stack can accept an even number of functions, and that Math.random() is in fact always the uneven-th function added in the stack? – Fabinout Aug 08 '13 at 12:24
  • 1
    It depends on how much space is allocated for the stack by the JVM. This happening in the `Math.random()` is a pure coincidence. – Levente Kurusa Aug 08 '13 at 12:25
  • 1
    @Fabinout My bet is that `Math.random()` requires more space then `loop()` on stack (for allocating local variables, etc...), so the probability of it happening there is higher, and might even be 1 if it requires more then double the space. Just a guess though. – amit Aug 08 '13 at 12:33
  • @LeventeKurusa Yes, with Java -Xss4m, I could ran this 10.000 times!! Thanks a lot ;) – Fabinout Aug 08 '13 at 12:35
  • @Fabinout I am happy that I could help! :) – Levente Kurusa Aug 08 '13 at 12:37
  • 3
    Haha, I am not here for the reputation. I enjoy to help others :p – Levente Kurusa Aug 08 '13 at 12:40
  • I suspect that Math.random() will always be the one which fails. This is because it is called first. If it succeed, the stack can grow by one so it won't fail later. – Peter Lawrey Aug 08 '13 at 13:32
  • @LeventeKurusa I just realised that my program is wrong. When I use " return loop(depth + 1) && loop(depth + 1);", it doesn't evaluate the second item if the first one is false. I need to rewrite it. – Fabinout Aug 08 '13 at 14:55
  • @Fabinout: The function doesn't ever return false, so that's a non-issue. Though, it does mean there's no reason to even return a value; better would be something like `if(prob()) { return; } else { loop(depth + 1); loop(depth+1); }` – BlueRaja - Danny Pflughoeft Aug 09 '13 at 16:21
4

As you said, the loop function recursively calls itself. Now, tail recursive calls can be rewritten to loops by the compiler, and not occupy any stack space (this is called the tail call optimization, TCO). Unfortunately, java compiler does not do that. And also your loop is not tail-recursive. Your options here are:

  1. Increase the stack size, as suggested by the other answers. Note that this will just defer the problem further in time: no matter how large your stack is, its size is still finite. You just need a longer chain of recursive calls to break out of the space limit.
  2. Rewrite the function in terms of loops
  3. Use a language, which has a compiler that performs TCO
    1. You will still need to rewrite the function to be tail-recursive
    2. Or rewrite it with trampolines (only minor changes are needed). A good paper, explaining trampolines and generalizing them further is called "Stackless Scala with Free Monads".

To illustrate the point in 3.2, here's how the rewritten function would look like:

def loop(depth: Int): Trampoline[Boolean] = {
  numberOfPrograms = numberOfPrograms + 1
  if(depth > bestDepth) {
    bestDepth = depth
  }
  if(proba()) done(true)
  else for {
    r1 <- loop(depth + 1)
    r2 <- loop(depth + 1)
  } yield r1 && r2
}

And initial call would be loop(0).run.

Laurel
  • 5,965
  • 14
  • 31
  • 57
George
  • 8,368
  • 12
  • 65
  • 106
2

Increasing the stack-size is a nice temporary fix. However, as proved by this post, though the loop() function is guaranteed to return eventually, the average stack-depth required by loop() is infinite. Thus, no matter how much you increase the stack by, your program will eventually run out of memory and crash.

There is nothing we can do to prevent this for certain; we always need to encode the stack in memory somehow, and we'll never have infinite memory. However, there is a way to reduce the amount of memory you're using by about 2 orders of magnitude. This should give your program a significantly higher chance of returning, rather than crashing.

We can do this by noticing that, at each layer in the stack, there's really only one piece of information we need to run your program: the piece that tells us if we need to call loop() again or not after returning. Thus, we can emulate the recursion using a stack of bits. Each emulated stack-frame will require only one bit of memory (right now it requires 64-96 times that, depending on whether you're running in 32- or 64-bit).

The code would look something like this (though I don't have a Java compiler right now so I can't test it):

static int bestDepth = 0;
static int numLoopCalls = 0;

public void emulateLoop() {
    //Our fake stack.  We'll push a 1 when this point on the stack needs a second call to loop() made yet, a 0 if it doesn't
    BitSet fakeStack = new BitSet();
    long currentDepth = 0;
    numLoopCalls = 0;

    while(currentDepth >= 0)
    {
        numLoopCalls++;

        if(proba()) {
            //"return" from the current function, going up the callstack until we hit a point that we need to "call loop()"" a second time
            fakeStack.clear(currentDepth);
            while(!fakeStack.get(currentDepth))
            {
                currentDepth--;
                if(currentDepth < 0)
                {
                    return;
                }
            }

            //At this point, we've hit a point where loop() needs to be called a second time.
            //Mark it as called, and call it
            fakeStack.clear(currentDepth);
            currentDepth++;
        }
        else {
            //Need to call loop() twice, so we push a 1 and continue the while-loop
            fakeStack.set(currentDepth);
            currentDepth++;
            if(currentDepth > bestDepth)
            {
                bestDepth = currentDepth;
            }
        }
    }
}

This will probably be slightly slower, but it will use about 1/100th the memory. Note that the BitSet is stored on the heap, so there is no longer any need to increase the stack-size to run this. If anything, you'll want to increase the heap-size.

Community
  • 1
  • 1
BlueRaja - Danny Pflughoeft
  • 84,206
  • 33
  • 197
  • 283
0

The downside of recursion is that it starts filling up your stack which will eventually cause a stack overflow if your recursion is too deep. If you want to ensure that the test ends you can increase your stack size using the answers given in the follow Stackoverflow thread:

How to increase to Java stack size?

Community
  • 1
  • 1
JeroenWarmerdam
  • 412
  • 2
  • 9