
I have written some code. When the loop variable `j` exceeds 1000, I start getting a "GC overhead limit exceeded" error. If I increase the allocated memory to 4 GB, I can iterate up to 2000, after which the same problem occurs. I want to keep increasing the memory as a last resort and would rather make my code scale. The compiler highlights a problem at the statements I have marked with an arrow. Can someone please explain what the problem could be? I have already looked at this question: Error java.lang.OutOfMemoryError: GC overhead limit exceeded

     for (int j=1; j<=num_doc; j++) { 
        List<Integer> list1 = new ArrayList<Integer>(Collections.nCopies(129039, 0));
        BufferedReader fl = new BufferedReader(new FileReader(dataFolder+"file"+ " ("+j+")"+".int"));
        String line1;

        while((line1=fl.readLine()) != null) {

            String[] arr=line1.split(" ");//<---------------------
            line1="";
            int k = Integer.parseInt(arr[0]);
            Arrays.fill(arr, "");
            numb=numb+1;
            int temp=(list1.get(k))+1;
            list1.set(k, temp);
        }
        F_d.add(numb);
        numb=0;
        fl.close();
        ls2d.add(new ArrayList<Integer>(list1));//<---------------------
        list1.clear();
    }
Shahzaib
  • Do you clear `ls2d` somewhere in your code? – Nicolas Filotto Sep 26 '16 at 15:13
  • Yes ls2d is used. I have pasted only that portion of the code that the compiler is highlighting. – Shahzaib Sep 26 '16 at 15:15
  • `list1` is already quite big, so if you keep adding it to `ls2d` without clearing it, it will quickly take up a lot of memory, which leads to an OOME – Nicolas Filotto Sep 26 '16 at 15:16
  • It's hard to understand what you really want to achieve here, but I'm almost 100% certain that there are easier ways to achieve your goal, with a fraction of the memory. – fvu Sep 26 '16 at 15:16
  • @NicolasFilotto list1.clear is already there – Shahzaib Sep 26 '16 at 15:18
  • I'm talking about clearing `ls2d`, not `list1` – Nicolas Filotto Sep 26 '16 at 15:19
  • `String[] arr=line1.split(" ");` and all the code after that seem to indicate that you only need the first element, which should be interpreted as an `int`. Confirm, deny, comment? – Adrian Colomitchi Sep 26 '16 at 15:19
  • To save some memory you could use an array of the primitive type `int` instead of a `List` for `list1` – Nicolas Filotto Sep 26 '16 at 15:20
  • @AdrianColomitchi True – Shahzaib Sep 26 '16 at 15:21
  • Seems like you want to count occurrences of the integer in column 1 of each line into some memory structure, and that the maximum integer you expect is 129039. Especially because the number is fixed, a standard array would be much easier to work with, and much less memory intensive. – fvu Sep 26 '16 at 15:24
  • I am working on encryption, so because of the hashes, inverses and AES I had to implement it this way. But you have interpreted this code snippet correctly. – Shahzaib Sep 26 '16 at 15:27
  • @NicolasFilotto `j` represents the number of documents. Each `list1` corresponds to a document and I want to keep them in tabular form in `ls2d`, so I won't be able to clear it. – Shahzaib Sep 26 '16 at 15:32

2 Answers


Two things can be changed right away to reduce the memory requirements:

        // we don't need all the fragments, taking only the first is fine
        int sp = line1.indexOf(' ');
        // if the line contains no space, the whole line is the first element
        String firstElem = (sp < 0) ? line1 : line1.substring(0, sp);
        line1=null;// let GC collect this at its convenience
        int k = Integer.parseInt(firstElem);

then

    // Don't add the copy, add list1 itself.
    // list1 is re-created at the start of each loop iteration anyway,
    // and nothing touches it in between.
    ls2d.add(list1);//<---------------------
    // list1.clear(); -- since we added list1, we don't clear it anymore
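To make the second point concrete, here is a minimal sketch of why storing the list itself is safe as long as a fresh list is created in every iteration and `clear()` is not called afterwards. The class name `RowTableDemo` and the method `buildTable` are hypothetical names used only for illustration; they are not part of the original code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not part of the asker's code.
class RowTableDemo {

    // Builds a table with one row per document, storing each row directly.
    static List<List<Integer>> buildTable(int numDocs, int size) {
        List<List<Integer>> table = new ArrayList<>();
        for (int j = 0; j < numDocs; j++) {
            // A new list is allocated on every iteration, so earlier rows
            // are never touched again by later iterations.
            List<Integer> row = new ArrayList<>(Collections.nCopies(size, 0));
            row.set(0, j);   // stand-in for the real counting work
            table.add(row);  // store the row itself: no second copy is made
            // No row.clear() here: it would empty the row that was just stored.
        }
        return table;
    }

    public static void main(String[] args) {
        List<List<Integer>> table = buildTable(3, 4);
        System.out.println(table);
    }
}
```

Compared with `new ArrayList<Integer>(list1)` followed by `list1.clear()`, this avoids allocating a second backing array of the same size for every document.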
Adrian Colomitchi

Here are a couple of ideas to reduce memory consumption, and probably runtime as well.

  • use an array instead of an ArrayList - you don't seem to use ArrayList specific functionality so an array will be more compact and easier to work with
  • tell split to just read the first field of a line

Note that I removed all the code that was intended to coerce the garbage collector into cleaning up; I don't think it helps in this case.


 for (int j=1; j<=num_doc; j++) { 
    int[] list1 = new int[129039];

    BufferedReader fl = new BufferedReader(new FileReader(dataFolder+"file"+ " ("+j+")"+".int"));
    String line1;

    while((line1=fl.readLine()) != null) {
        String[] arr=line1.split(" ",2); // Just read first field - see String Javadoc
        int k = Integer.parseInt(arr[0]);
        list1[k]=list1[k]+1;
        numb=numb+1;
    }
    F_d.add(numb);
    numb=0;
    fl.close();
    ls2d.add(list1);// you'll need to change ls2d's type, e.g. to List<int[]>; note that Arrays.asList on an int[] yields a List<int[]>, not a List<Integer>
}
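As a self-contained variant of the loop above, the following sketch counts first-field occurrences into an `int[]` and also estimates how large the dense table gets. It is only a sketch under stated assumptions: the names `FirstFieldCounter`, `countFirstFields` and `denseTableBytes` are hypothetical, blank lines are assumed to be skippable, and the input is read from a `StringReader` rather than from the real `.int` files.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of the original code.
class FirstFieldCounter {

    // Counts how often each integer appears as the first
    // space-separated field of a line.
    static int[] countFirstFields(BufferedReader reader, int size) {
        int[] counts = new int[size];
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue; // assumption: blank lines carry no data
                }
                int sp = line.indexOf(' ');
                // Only the first field is needed, so the rest is never split.
                String first = (sp < 0) ? line : line.substring(0, sp);
                counts[Integer.parseInt(first)]++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    // Rough size in bytes of the int data in a dense table
    // (array headers and list overhead are ignored).
    static long denseTableBytes(int numDocs, int size) {
        return (long) numDocs * size * Integer.BYTES;
    }

    public static void main(String[] args) throws IOException {
        List<int[]> table = new ArrayList<>();
        String doc = "3 a b\n3 x\n0 y\n"; // stand-in for one input file
        // try-with-resources closes the reader even if parsing fails
        try (BufferedReader r = new BufferedReader(new StringReader(doc))) {
            table.add(countFirstFields(r, 5));
        }
        System.out.println(Arrays.toString(table.get(0)));
        System.out.println(denseTableBytes(2000, 129039) + " bytes for 2000 documents");
    }
}
```

Since an `int` takes 4 bytes, a dense table of 129039 counters per document needs roughly 1 GB of `int` data for 2000 documents, so switching to arrays lowers the overhead but the table still grows linearly with the number of documents.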
fvu