1

I have a problem with the memory usage of my Java application, with both heap space and non-heap space. For now I am concentrating on the heap space.

My application is a socket server that receives input via a DataInputStream. I read the incoming data as byte arrays. The amount of input per second is irregular, but we are talking about a range of 400 to 1,000 bytes per second; peaks can go higher.

Since my program is a server, it waits for input in an endless loop. My problem is that heap usage climbs over time: every 5-10 minutes it rises by about 0.5 MB.

I used several monitoring tools, such as jconsole and YourProfiler. After that I tried to track the problem down with heap dumps, which I generate with jmap and analyse with Eclipse Memory Analyzer.

Now my question is: in the example code below, which option is better, or rather uses less heap space, 1 or 2?

Option 1:

while (true) {
    byte[] one = new byte[21];
    // do something with one
    byte[] two = new byte[50];
    // do something with two
    byte[] three = new byte[30];
    // do something with three
}

Option 2:

byte[] one;
byte[] two;
byte[] three;
while (true) {
    one = new byte[21];
    // do something with one
    two = new byte[50];
    // do something with two
    three = new byte[30];
    // do something with three
}

I am not sure what happens to the three objects created in the loop. They are held in local variables that are only visible and accessible inside the loop. After each iteration, I would expect the JVM to discard them and create new ones in the next iteration. So there should be no memory leak, I guess?

In the second option, the three variables are declared outside the loop, so they stay alive the whole time. Inside the loop, the references held by these variables change, so nothing refers to the old arrays any more, which means they get deleted, that is, collected by the GC.

In both options there will be about 4 iterations per second.

Thank you in advance for your help!

Result of JUnit Test:

[Screenshot of the JUnit test result; image not preserved]

Lennie
  • This is a little [XY Problem](http://meta.stackexchange.com/questions/66377/what-is-the-xy-problem)-ish. Is there any reason to believe that the different declaration locations will make any difference to heap usage? Have you tried measuring the difference? Why do you think one of your "option 1" and "option 2" will be the solution to your problem? – davmac Jul 15 '16 at 08:26
  • Hi, my reason for expecting different heap usage between the two options is that I think the GC collects differently in each. I monitored both options but couldn't find a significant difference. The problem is that the conditions under which my application runs aren't always the same. In every monitoring run, the object count in the heap dump file keeps rising, and this is the only endless loop in my code and the only place where new objects/references are created. I will try to analyse my code more deeply to find another place where a memory leak could happen. – Lennie Jul 15 '16 at 08:35
  • Hi @davmac I have actually measured the difference. Please see the `JUnit` test cases of my answer. – Sanjeev Saha Jul 15 '16 at 08:49
  • "I think the collection from the GC is different between the options" - why? In what way? If you explain in your question how you think GC is affected by the options, you would get better quality answers. It may be that what you _think_ is not correct, or at least, is not significant to the problem you have posed in your question. Please read the XY Problem link I posted above. I personally do not believe there would be any significant difference between the options, and your measurements support my view. – davmac Jul 15 '16 at 09:21
  • Also, you talk about a memory leak, but there is no evidence of a memory leak. A growing heap is a natural state for a Java program. Objects that are no longer referenced are not immediately cleared from the heap - that requires garbage collection, and garbage collection is more efficient when the heap is allowed to grow. Growing heap != memory leak. If you want to be sure, put a `System.gc()` call inside your loop and watch the heap size then. (But don't do this in production code). – davmac Jul 15 '16 at 09:24
  • Thanks for the answer. I monitored the program for nearly 24 hours and the heap size was still growing; very slowly, yes, but still. I had System.gc() in my code, but that didn't help, and I read that one should not use it because the JVM can ignore the request and not perform the GC. – Lennie Jul 15 '16 at 09:32
  • You are most likely right about the XY problem, but I can't post all the code of the application here. My actual problem is the growth of heap and non-heap space. I will let my application run with Andy Turner's solution and monitor it. If it doesn't help, I will have to search for other code segments that could cause heap growth. – Lennie Jul 15 '16 at 09:33
  • Hi @Lennie, you must have noticed the performance degradation in the case where the variable was declared outside the loop in the `JUnit Test`. Although people often try to argue that `Option 1` and `Option 2` are equivalent in terms of memory usage, the results of the `JUnit Test` prove that declaring the variable inside the loop is far better in this regard. Please look at the top two answers at: http://stackoverflow.com/questions/407255/difference-between-declaring-variables-before-or-in-loop Which of the two options have you chosen? – Sanjeev Saha Jul 16 '16 at 11:38
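The distinction drawn in the comments above, between a heap that grows and a heap that leaks, can be checked roughly by looking at used heap after a requested collection rather than at the raw heap size. The following is only a sketch: the class name `HeapAfterGc` and the loop sizes are made up for illustration, and the JVM is free to ignore the GC request, so the numbers are indicative at best.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original application.
final class HeapAfterGc {

    /** Requests a GC, then reports used heap in bytes. The JVM may ignore the request. */
    static long usedHeapAfterGc() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        memory.gc(); // effectively System.gc(); for diagnostics only, not production code
        return memory.getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        long baseline = usedHeapAfterGc();

        // Garbage: each array becomes unreachable right after it is created,
        // so it should not add much to the post-GC figure.
        for (int i = 0; i < 1_000_000; i++) {
            byte[] temp = new byte[50];
        }
        System.out.println("growth after garbage:  " + (usedHeapAfterGc() - baseline));

        // Retained: the list keeps every array reachable, so they survive the GC.
        List<byte[]> retained = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
            retained.add(new byte[50]);
        }
        System.out.println("growth after retained: " + (usedHeapAfterGc() - baseline));
        System.out.println("retained arrays: " + retained.size()); // keeps the list alive
    }
}
```

If the post-GC figure keeps rising over many hours, something is probably still holding references; if only the pre-GC figure rises, the growth is most likely uncollected garbage.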

3 Answers

5

The variables one, two and three are just references: they don't hold the value themselves, but instead just refer to places in the heap where the actual array object is stored.

As such, there is no difference between the two approaches in terms of the number of objects allocated.

Option 3: allocate the arrays outside the loop, and reuse the same arrays:

byte [] one= new byte [21];
byte [] two= new byte [50];
byte [] three= new byte [30];
while (true){
  // If necessary, zero out the arrays so that data from the previous
  // iteration is not used accidentally.
  Arrays.fill(one, (byte) 0);
  Arrays.fill(two, (byte) 0);
  Arrays.fill(three, (byte) 0);

  // Rest of the loop.
}

This allocates the arrays up-front, so only 3 array objects are created, rather than (3 * #iterations) array objects.

Note that you can only use this approach if you don't leak references to the arrays, e.g. by putting them into a list that exists outside the body of the loop.
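Applied to the DataInputStream case from the question, buffer reuse could look roughly like the sketch below. The class name `ReusedBufferReader` and the 21/50/30-byte message layout are assumptions made for illustration; the real protocol is not shown in the question. Because `readFully` overwrites the whole array, no `Arrays.fill` is needed here.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical reader: assumes each message is 21 + 50 + 30 bytes.
final class ReusedBufferReader {
    // Allocated once; every iteration overwrites the previous contents.
    private final byte[] one = new byte[21];
    private final byte[] two = new byte[50];
    private final byte[] three = new byte[30];

    /** Reads messages until end of stream and returns how many complete ones were read. */
    int readAll(DataInputStream in) throws IOException {
        int messages = 0;
        while (true) {
            try {
                in.readFully(one);   // fills the entire array or throws EOFException
                in.readFully(two);
                in.readFully(three);
            } catch (EOFException endOfStream) {
                return messages;     // a truncated trailing message is dropped
            }
            // Process one, two and three here without storing references to them elsewhere.
            messages++;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] input = new byte[2 * 101 + 5]; // two complete messages plus a truncated third
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(input));
        System.out.println(new ReusedBufferReader().readAll(in));
    }
}
```

If the processing step needs to keep the data beyond one iteration, it has to copy it, since the next `readFully` call overwrites the buffer.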


To demonstrate that the memory allocation is identical in the OP's two approaches, compile the following two methods and disassemble them (for example with javap -c):

  public static void inLoop() {
    while (true) {
      byte[] one = new byte[21];
      byte[] two = new byte[50];
      byte[] three = new byte[30];
    }
  }

  public static void outsideLoop() {
    byte[] one;
    byte[] two;
    byte[] three;
    while (true) {
      one = new byte[21];
      two = new byte[50];
      three = new byte[30];
    }
  }

These two methods compile to identical bytecode:

  public static void inLoop();
    Code:
       0: bipush        21
       2: newarray       byte
       4: astore_0
       5: bipush        50
       7: newarray       byte
       9: astore_1
      10: bipush        30
      12: newarray       byte
      14: astore_2
      15: goto          0

  public static void outsideLoop();
    Code:
       0: bipush        21
       2: newarray       byte
       4: astore_0
       5: bipush        50
       7: newarray       byte
       9: astore_1
      10: bipush        30
      12: newarray       byte
      14: astore_2
      15: goto          0

As such, the runtime memory allocation must be the same.
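The same conclusion can be cross-checked at run time by counting the bytes the current thread allocates while each variant runs. This is a sketch under assumptions: it relies on `com.sun.management.ThreadMXBean`, a JDK-specific extension that may be missing on other JVMs, and the class name `AllocationComparison` and the `sink` field are invented for illustration. The loops are bounded so that they terminate.

```java
import java.lang.management.ManagementFactory;

// Hypothetical comparison harness, not taken from the original answer.
final class AllocationComparison {
    // Publishing each array to a field keeps the JIT from optimizing the allocation away.
    static volatile byte[] sink;

    static void inLoop(int iterations) {
        for (int i = 0; i < iterations; i++) {
            byte[] one = new byte[21];
            sink = one;
            byte[] two = new byte[50];
            sink = two;
            byte[] three = new byte[30];
            sink = three;
        }
    }

    static void outsideLoop(int iterations) {
        byte[] one;
        byte[] two;
        byte[] three;
        for (int i = 0; i < iterations; i++) {
            one = new byte[21];
            sink = one;
            two = new byte[50];
            sink = two;
            three = new byte[30];
            sink = three;
        }
    }

    /** Bytes allocated by the current thread while running the task, or -1 if unsupported. */
    static long allocatedBytes(Runnable task) {
        java.lang.management.ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!(bean instanceof com.sun.management.ThreadMXBean)) {
            return -1;
        }
        com.sun.management.ThreadMXBean extended = (com.sun.management.ThreadMXBean) bean;
        if (!extended.isThreadAllocatedMemorySupported() || !extended.isThreadAllocatedMemoryEnabled()) {
            return -1;
        }
        long id = Thread.currentThread().getId();
        long before = extended.getThreadAllocatedBytes(id);
        task.run();
        return extended.getThreadAllocatedBytes(id) - before;
    }

    public static void main(String[] args) {
        int iterations = 1_000_000;
        System.out.println("declared inside:  " + allocatedBytes(() -> inLoop(iterations)));
        System.out.println("declared outside: " + allocatedBytes(() -> outsideLoop(iterations)));
    }
}
```

The two figures should come out close to each other; small differences are measurement noise rather than an effect of where the variables are declared.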

Andy Turner
  • Thanks for the answer, that means in both my options there will be 3*iterations array objects created? Wouldn't the "old" references be collected by the GC? Will try your solution directly! – Lennie Jul 15 '16 at 08:11
  • @Lennie yes: "there is no difference between the two approaches in terms of the number of objects allocated". – Andy Turner Jul 15 '16 at 08:11
  • 1
    @Lennie "Wouldn't the "old" references be collected by the GC?" They are, eventually, when the GC happens to run. If you can avoid allocating extra objects, though, you don't need to rely on the GC to run at any particular time. – Andy Turner Jul 15 '16 at 08:12
  • Ah, okay thanks again for your reply and answer. The tip to avoid relying on the GC is very helpful! – Lennie Jul 15 '16 at 08:37
  • It's true that the bytecode for the two variants is identical, but the live range table (which is stored separately to the bytecode, within the .class file) will record different live ranges between the two code forms. This can result in difference in execution: specifically the values in the variables in "option 2" can be garbage collected one iteration earlier. In the long run, this is going to make very little difference, unless the allocations in the loop are huge. – davmac Jul 15 '16 at 10:52
0

Let us consider Option 2:

For the second iteration consider the following:

one= new byte [21];

When the array object new byte[21] has been created but not yet assigned to one, there would be two objects in memory:

  1. the object which was created in the first iteration and assigned to one. This object is still in scope and therefore not eligible for GC.
  2. the object which is just created but yet to be assigned to one. This object is also in scope and not eligible for GC.

So memory usage in Option 2 will be higher than in Option 1, where in the same scenario the object created in the first iteration would be out of scope and eligible for garbage collection.

So Option 1 is better in terms of heap space usage!

The following program proves my point:

import org.junit.Assert;
import org.junit.Test;

public class VariableDeclarationTest {

    // this value may be increased or decreased as per system
    private static final int ARRAY_SIZE = 5400;

    @Test
    public void testDeclareVariableInsideLoop() {
        System.out.println("\n--------testDeclareVariableInsideLoop --------");

        boolean successFlag = false;
        for (int i = 1; i <= 3; i++) {
            System.out.println("iteration: " + i);
            Integer[][] arr = getLargeArray(ARRAY_SIZE); // declare inside loop
            System.out.println("Got an array of size: " + arr.length);
            successFlag = true;

        }
        Assert.assertEquals(true, successFlag);

    }

    @Test(expected = OutOfMemoryError.class)
    public void testDeclareVariableOutsideLoop() {
        System.out.println("\n---------testDeclareVariableOutsideLoop --------");
        Integer[][] arr = null; // declare outside loop
        for (int i = 1; i <= 3; i++) {
            System.out.println("iteration: " + i);
            arr = getLargeArray(ARRAY_SIZE);
            System.out.println("Got an array of size: " + arr.length);

        }

    }

    private Integer[][] getLargeArray(int size) {
        System.out.print("starts producing array....");
        Integer[][] arr = new Integer[size][size];
        for (int i = 0; i < arr.length; i++) {
            for (int j = 0; j < arr[i].length; j++) {
                arr[i][j] = size;
            }
        }
        System.out.println(" completed");
        return arr;
    }

}

The value of ARRAY_SIZE may be increased or decreased according to the configuration and load of the system. At some point you will see that the method where the variable is declared outside the loop throws OutOfMemoryError, but the method where the variable is declared inside the loop does not.

Sanjeev Saha
  • This is simply incorrect. The memory allocation is identical. See the edit to my answer. – Andy Turner Jul 15 '16 at 08:18
  • Hi @Lennie, could you please run the `JUnit` test cases as described in my answer and see for yourself which of the two options is better? – Sanjeev Saha Jul 15 '16 at 08:39
  • Hi @AndyTurner could you please run the `JUnit` test cases as described in my answer and see for yourself which of two options is better? – Sanjeev Saha Jul 15 '16 at 08:44
  • the reason for the OOM in the second test is most likely that `testDeclareVariableInsideLoop` executes first, and its arrays aren't GC'd by the time `testDeclareVariableOutsideLoop` is executed. Try decompiling it, like I demonstrate, and observe that the bytecode is the same for the two test cases, so there literally can be no way that they perform differently in terms of memory allocation. – Andy Turner Jul 15 '16 at 08:48
  • Hi @AndyTurner, please try running the tests so that `testDeclareVariableOutsideLoop` executes first and `testDeclareVariableInsideLoop()` second. You will find that the result is the same. I have just executed it on my system: `testDeclareVariableOutsideLoop` runs out of memory even when it is executed first. – Sanjeev Saha Jul 15 '16 at 08:53
  • @Sanjeev Saha, I ran the JUnit test, but where do I see the different heap size? I got no OutOfMemoryError. – Lennie Jul 15 '16 at 08:59
  • It is true that declaring the variable outside the loop means that the allocation inside the loop occurs with the previous object still allocated. However, this has very little to do with the question, since as soon as the assignment completes the original object is eligible for garbage collection, and therefore this minor detail does not make any difference to heap _growth over time_ which is what the question is about. – davmac Jul 15 '16 at 09:18
  • 1
    @Sanjeev Saha, I ran it with a bigger array size, and you were right, I get the OutOfMemoryError. So that means if I declare the object inside the loop, the "old" reference can be collected faster by the GC? – Lennie Jul 15 '16 at 09:57
-1

Short answer: Option 1, because of the way it is initialized.

Mohammadreza Khatami