0

I'm running a Java program that reads a 1.2 GB file line by line and puts each entry into a HashMap. Some time after it starts calling taxhash.put(tmpgi,tmptax), it throws java.lang.OutOfMemoryError.

I tried changing the eclipse.ini options as follows:

-startup
plugins/org.eclipse.equinox.launcher_1.1.1.R36x_v20101122_1400.jar
--launcher.library
plugins/org.eclipse.equinox.launcher.win32.win32.x86_64_1.1.2.R36x_v20101222
-product
org.eclipse.epp.package.jee.product
--launcher.defaultAction
openFile
--launcher.XXMaxPermSize
512M
-showsplash
org.eclipse.platform
--launcher.XXMaxPermSize
512m
--launcher.defaultAction
openFile
-vmargs
-Dosgi.requiredJavaVersion=1.5
-Xms2048m
-Xmx3548m

BTW, I'm running my code on 64-bit Windows 7 with 4 GB of RAM. Here is the code that reads from the file:

boolean readfile(String filename, int verbose) {
    // Reads the input file and stores each gi -> taxid pair in taxhash.
    taxhash = new HashMap<Integer, Integer>();
    int currnum = 0;
    // try-with-resources closes the reader even if parsing fails
    try (BufferedReader inread = new BufferedReader(new FileReader(filename))) {
        String instring;
        while ((instring = inread.readLine()) != null) {
            // split on one or more whitespace characters
            String[] tmparr = instring.trim().split("\\s+");
            // each line should hold exactly two fields: the gi number and the taxid
            if (tmparr.length != 2) {
                System.err.println("Error reading from " + filename + " " + tmparr.length + " elements.");
            } else {
                try {
                    taxhash.put(Integer.valueOf(tmparr[0]), Integer.valueOf(tmparr[1]));
                } catch (NumberFormatException e) {
                    System.err.println("unable to parse number from " + tmparr[0] + " " + tmparr[1]);
                    return false;
                }
            }
            // in verbose mode, print a progress dot every 100000 lines
            if (verbose > 0 && ++currnum == 100000) {
                System.out.print(".");
                currnum = 0;
            }
        }
    } catch (IOException e) {
        System.err.println("IOError in reading from " + filename);
        e.printStackTrace();
        return false;
    }
    return true;
}// end readfile

This is the error in more detail:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.HashMap.resize(Unknown Source)
    at java.util.HashMap.addEntry(Unknown Source)
    at java.util.HashMap.put(Unknown Source)
    at com.ali.Blammer.taxid.readfile(taxid.java:79)
    at com.ali.Blammer.taxid.readfile(taxid.java:50)
    at com.ali.Blammer.main.run(main.java:182)
    at com.ali.Blammer.blammer.main(blammer.java:36)
    at com.ali.Interface.main.main(main.java:53)
trincot
Ali_IT

2 Answers

2

You are changing the memory options of the JVM that runs Eclipse itself, not the memory options of the program that you launch from it.

In the Run configuration, the second tab (Arguments) lets you set JVM parameters for the launched program. Even so, I doubt you will fit a 1.2 GB file (plus hash and other overhead) into 512 MB.
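One way to confirm which heap limit the launched program actually gets is to print it from inside the program. The following is only a minimal sketch; the class name `HeapCheck` is hypothetical and not part of the asker's code. `Runtime.maxMemory()` is a standard JDK call.

```java
class HeapCheck {
    /** Returns the maximum heap this JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Run this from the same Run configuration as the real program.
        System.out.println("Max heap available to this program: "
                + maxHeapMegabytes() + " MB");
    }
}
```

If the printed value does not change after editing eclipse.ini, the setting only applies to Eclipse. Putting an option such as `-Xmx3g` in the VM arguments box of the Run configuration should change it, assuming the machine has enough free memory.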

SJuan76
1

You need to change the heap size of your program, not of Eclipse. Since you are storing 1.2 GB of text, you need at least 2.4 GB of memory, but I suspect closer to 4 GB is required (with overhead).

Since you have a small machine, I suggest you process the file progressively to minimise memory consumption if you can.

BTW: You could use TIntIntHashMap (from the Trove library), which would be much smaller than HashMap (up to 4x smaller), but it could still be too much for your data set.
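To illustrate why a primitive map is smaller, here is a rough sketch of an int-to-int map built on two `int[]` arrays with open addressing. It is not Trove's implementation or API; the class `IntIntMap`, its methods, and the choice of `Integer.MIN_VALUE` as a reserved "empty" key are all assumptions made for the example.

```java
import java.util.Arrays;

/** Hypothetical minimal int-to-int map; a sketch, not Trove's TIntIntHashMap. */
class IntIntMap {
    // Assumption: Integer.MIN_VALUE never occurs as a real key.
    private static final int EMPTY = Integer.MIN_VALUE;
    private int[] keys;
    private int[] values;
    private int size;

    IntIntMap(int expectedSize) {
        if (expectedSize < 0 || expectedSize > (1 << 29)) {
            throw new IllegalArgumentException("unsupported expected size");
        }
        int capacity = 16;
        // Power-of-two capacity, at least twice the expected number of entries.
        while (capacity < expectedSize * 2L) capacity <<= 1;
        keys = new int[capacity];
        values = new int[capacity];
        Arrays.fill(keys, EMPTY);
    }

    private static int slot(int key, int mask) {
        int h = key * 0x9E3779B9; // multiplicative hash to spread sequential keys
        return (h ^ (h >>> 16)) & mask;
    }

    void put(int key, int value) {
        if (key == EMPTY) throw new IllegalArgumentException("reserved key");
        if ((size + 1) * 2L > keys.length) grow(); // keep at most half the slots full
        int mask = keys.length - 1;
        int i = slot(key, mask);
        // Linear probing: step forward until the key or a free slot is found.
        while (keys[i] != EMPTY && keys[i] != key) i = (i + 1) & mask;
        if (keys[i] == EMPTY) {
            keys[i] = key;
            size++;
        }
        values[i] = value;
    }

    /** Returns the stored value, or {@code missing} if the key is absent. */
    int get(int key, int missing) {
        int mask = keys.length - 1;
        int i = slot(key, mask);
        while (keys[i] != EMPTY) {
            if (keys[i] == key) return values[i];
            i = (i + 1) & mask;
        }
        return missing;
    }

    int size() {
        return size;
    }

    private void grow() {
        if (keys.length >= (1 << 30)) throw new IllegalStateException("map too large");
        int[] oldKeys = keys;
        int[] oldValues = values;
        keys = new int[oldKeys.length * 2];
        values = new int[oldKeys.length * 2];
        Arrays.fill(keys, EMPTY);
        size = 0;
        // Re-insert every existing entry into the larger arrays.
        for (int j = 0; j < oldKeys.length; j++) {
            if (oldKeys[j] != EMPTY) put(oldKeys[j], oldValues[j]);
        }
    }
}
```

The saving comes from storing raw ints in flat arrays: there are no boxed Integer objects and no per-entry node objects as in java.util.HashMap. In a real project the library class is probably the better choice; this only shows the idea.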

Peter Lawrey
  • Thanks, but what do you mean by "process the file progressively to minimise memory consumption if you can"? – Ali_IT May 26 '13 at 02:03
  • 1
    Instead of reading the whole file before using the information, it is often possible to read a portion of the file, use that, then read some more, and so on. This assumes you can break it up that way. – Peter Lawrey May 26 '13 at 09:09
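
As a rough illustration of the progressive approach described in this comment, the sketch below streams the file once and keeps only the entries the program actually needs. It assumes the gi numbers of interest are known in advance, which the question does not state; the class `TaxLookup` and the method `lookup` are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class TaxLookup {
    /**
     * Streams "gi taxid" lines and keeps only the pairs whose gi is in wanted.
     * Memory use grows with the size of wanted, not with the size of the file.
     */
    static Map<Integer, Integer> lookup(BufferedReader in, Set<Integer> wanted) {
        Map<Integer, Integer> found = new HashMap<Integer, Integer>();
        try {
            String line;
            while ((line = in.readLine()) != null) {
                String[] parts = line.trim().split("\\s+");
                if (parts.length != 2) continue; // skip lines without exactly two fields
                Integer gi = Integer.valueOf(parts[0]);
                if (wanted.contains(gi)) {
                    found.put(gi, Integer.valueOf(parts[1]));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        // Tiny in-memory stand-in for the real 1.2 GB file.
        String sample = "2\t9913\n3\t9913\n5\t9606\n";
        Set<Integer> wanted = new HashSet<Integer>(Arrays.asList(3, 5, 11));
        System.out.println(lookup(new BufferedReader(new StringReader(sample)), wanted));
    }
}
```

For the real file, the reader would wrap a FileReader instead of a StringReader. If every entry is needed at once, this approach does not help and a larger heap or a more compact map is still required.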