Reading this post, I found that adding -XX:MaxJavaStackTraceDepth=1000000 to the JVM args lets me see all frames of a stack trace, which can be handy in stack overflow situations. But I'm trying to understand why the limit is 1024 by default, and whether there is any danger in effectively removing it. I understand that a full trace might use a lot of heap if there really is a StackOverflowError. But given heap sizes these days, how much of a danger is that, really? Are there other concerns? There are hints that it has some performance impact, but it is unclear whether that is true.

UPDATE: As per the comments, a reasonable stack depth is far less than that. In practice I see StackOverflowError at a depth of about 18000 frames. So maybe the right limit is 100000? But the question remains the same.

Wheezil
  • In general I don't think you need a depth of a million to find a recursion issue. A million seems excessive *(and would be annoying in SO errors)*. Seems like the shorter the better, with some tuning based on your runtime environment. E.g., 1K is plenty for trivial apps. Something running in Giant Big Environment might need 2x or 8x or more. – Dave Newton Jul 11 '23 at 19:18
  • This option is not for the size of the STACK; it sets the maximum number of frames that will be included in the stack trace of an exception, which defaults to 1024. You are right that, practically speaking, you won't get more than about 20k frames with the default JVM stack of 1MB. So this limit could be, say, 100k. But the question is the same: does this option have negative impacts? – Wheezil Jul 11 '23 at 19:50
  • Well sure it does, if you truly have an error being thrown at all regularly, which is 20k frames deep, and happen to log it anywhere. In that case you're going to be IO bound for quite a bit, which can definitely hurt something that was as innocent as reporting an exception. Stack traces will also tend to remove excessive/duplicate lines between chained exceptions anyhow. – Rogue Jul 11 '23 at 19:57
  • End-all point: if you're in production, why pollute the system with the noise? If you've already noticed the defect (and can reproduce it), then tick it on if you need in development in order to go down the rabbit hole. – Rogue Jul 11 '23 at 19:58
  • Good point about the size/speed of logging. Certainly argues for an option. But there are cases we cannot reproduce in the lab, must be available in production. Like... some RDBMS driver was losing its mind and throwing StackOverflow. Hard to figure out why without being able to log at least the top and bottom 100 of the stack. Which is why we are here. Was asked by support team to offer better error logs for the rare stackoverflow event. – Wheezil Jul 11 '23 at 20:06
  • Sensitive areas like a DB driver would be chief places to avoid such logs. It's a bit like a tumble dryer: it doesn't matter how you got to a SO, because once you're there it'll just keep spinning. The solution is to prevent that area from spinning, and perhaps dump relevant debug information if a SO occurs (you can catch those! Just rethrow for sanity's sake). – Rogue Jul 11 '23 at 20:19
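
The idea from the comments above (catch the StackOverflowError and log only the top and bottom of the trace) could look roughly like the sketch below. The class `TraceSummary` and the helper `summarize` are hypothetical names made up for illustration, not part of any library. Note that with the default -XX:MaxJavaStackTraceDepth the recorded trace is already truncated, so the "last" frames printed here are only the deepest frames the JVM kept, not necessarily the real bottom of the stack.

```java
import java.util.ArrayList;
import java.util.List;

public class TraceSummary {
    // Hypothetical helper: keep the first and last `keep` frames of a trace
    // and replace everything in between with a single marker line.
    static List<String> summarize(StackTraceElement[] trace, int keep) {
        List<String> lines = new ArrayList<>();
        if (trace.length <= 2 * keep) {
            // Short trace: nothing to elide.
            for (StackTraceElement e : trace) {
                lines.add("  at " + e);
            }
            return lines;
        }
        for (int i = 0; i < keep; i++) {
            lines.add("  at " + trace[i]);
        }
        lines.add("  ... " + (trace.length - 2 * keep) + " frames omitted ...");
        for (int i = trace.length - keep; i < trace.length; i++) {
            lines.add("  at " + trace[i]);
        }
        return lines;
    }

    // Unbounded recursion, used only to provoke a StackOverflowError.
    static int recurse(int n) {
        return recurse(n + 1) + 1;
    }

    public static void main(String[] args) {
        try {
            recurse(0);
        } catch (StackOverflowError so) {
            StackTraceElement[] trace = so.getStackTrace();
            // The recorded frame count is capped by -XX:MaxJavaStackTraceDepth.
            System.err.println("StackOverflowError, " + trace.length + " frames recorded");
            summarize(trace, 5).forEach(System.err::println);
            // Real code would typically rethrow here after logging.
        }
    }
}
```

Running this once with the default settings and once with a larger -XX:MaxJavaStackTraceDepth shows how the recorded frame count changes while the logged output stays short either way.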

1 Answer

I wrote a little test to see how many try/throw/catch cycles could be done in a loop. I was surprised to find that the time spent is proportional to the stack depth at which the exception is thrown, even if there are no frames between the try{} and the throw. This makes sense, because the Throwable's frames array is populated when the exception is created, up to the limit. Which leads to: yes, raising this option DOES have a performance impact (in addition to logging, as noted by Rogue), but only when the stack depth at the point of the throw is higher than the default limit of 1024. Below that depth, the full trace is captured either way.

Test code:

public class Test1 {
  // Target recursion depth for the current run; set by main().
  static int DEPTH = 10;
  static void test(int depth) {
    if (depth < DEPTH) {
      // Recurse until we get to the desired depth
      test(depth+1);
    } else {
      // Fewer iterations at greater depth, to keep each run short
      int ITER = 1000000 / DEPTH;
      long start = System.currentTimeMillis();
      for (int i = 0; i < ITER; i++) {
        try {
          // The stack trace is captured when the exception is constructed
          throw new RuntimeException();
        } catch (Exception ex) {
          // Intentionally ignored: only the cost of the throw is measured
        }
      }
      long end = System.currentTimeMillis();
      System.out.println("DEPTH=" + DEPTH + ": throws/sec=" + ITER / ((end - start) * 0.001f));
    }
  }
  public static void main(String[] args) throws Exception {
    // Depths 1, 2, 4, ..., 8192
    for (DEPTH = 1; DEPTH <= 10000; DEPTH *= 2) {
      test(0);
    }
  }
}

Run with the default limit:

$ java -jar target/*shad*jar
DEPTH=1: throws/sec=1182033.1
DEPTH=2: throws/sec=1184834.1
DEPTH=4: throws/sec=1068376.0
DEPTH=8: throws/sec=833333.3
DEPTH=16: throws/sec=578703.7
DEPTH=32: throws/sec=336021.5
DEPTH=64: throws/sec=195312.48
DEPTH=128: throws/sec=91905.88
DEPTH=256: throws/sec=48824.996
DEPTH=512: throws/sec=25697.367
DEPTH=1024: throws/sec=13369.862
DEPTH=2048: throws/sec=12842.1045
DEPTH=4096: throws/sec=12199.999
DEPTH=8192: throws/sec=12199.999

Run with larger stack trace:

$ java -XX:MaxJavaStackTraceDepth=1000000 -jar target/*shad*jar
DEPTH=1: throws/sec=1166861.1
DEPTH=2: throws/sec=1157407.4
DEPTH=4: throws/sec=1050420.1
DEPTH=8: throws/sec=816993.4
DEPTH=16: throws/sec=578703.7
DEPTH=32: throws/sec=322164.94
DEPTH=64: throws/sec=190548.78
DEPTH=128: throws/sec=87775.28
DEPTH=256: throws/sec=50076.92
DEPTH=512: throws/sec=26039.998
DEPTH=1024: throws/sec=13189.189
DEPTH=2048: throws/sec=6777.7773
DEPTH=4096: throws/sec=3388.8887
DEPTH=8192: throws/sec=1718.3098

As you can see, with the default trace limit of 1024, throughput levels off at around 12K throws/sec once the stack depth exceeds 1024. But when you raise the limit to 1000000, throughput keeps declining, roughly halving each time the depth doubles.

Thanks to Rogue for suggesting the logging impact.

Side thought: as is often stated, never use exceptions for normal flow control! The cost of a throw depends on the stack depth at which your code is called, which you do not control.

Wheezil
  • A trick that may or may not be useful to you: In a custom exception you can override the `fillInStackTrace` method to do nothing. In that case, no time will be taken to populate the stack frames, because they won't be populated. But it will be much less useful if logged. – David Conrad Jul 11 '23 at 23:37
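
A minimal sketch of the `fillInStackTrace` trick mentioned in the comment above. The names `NoTraceDemo` and `NoTraceException` are hypothetical and chosen only for illustration. Because the override returns immediately, no stack walk happens when the exception is constructed, so the cost should no longer depend on the stack depth; the trade-off is that the exception carries no frames when logged.

```java
public class NoTraceDemo {
    // Hypothetical exception type that never records a stack trace.
    static class NoTraceException extends RuntimeException {
        NoTraceException(String message) {
            super(message);
        }

        @Override
        public synchronized Throwable fillInStackTrace() {
            // Skip the stack walk entirely; the trace stays empty.
            return this;
        }
    }

    public static void main(String[] args) {
        try {
            throw new NoTraceException("control-flow signal");
        } catch (NoTraceException ex) {
            // No frames were captured, so the array is empty.
            System.out.println(ex.getMessage() + ": frames=" + ex.getStackTrace().length);
        }
    }
}
```

A similar effect is available without overriding anything, through the protected `Throwable(String, Throwable, boolean, boolean)` constructor with `writableStackTrace` set to false.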