1

I have the following situation: I read lines from a CSV file and put them into a List<String>. After that, the lines are parsed according to special logic and their parts are put as keys into several HashMap<String, Integer> instances. Then the list records is cleared. I have tried several ways:

records.clear();
records = null;
records = new ArrayList<String>();

But it seems the memory is not released either way (I checked with a profiler and with simple prints to the console). Because this cycle of reading the file and parsing it is repeated several times, at some point I get an OutOfMemoryError.

Could anybody suggest a solution? Can this be solved in Java at all, or is the pool of strings simply out of the garbage collector's reach? Or would another language such as C++ be more suitable?

Thank you.

Rudi
  • We need more code in order to be able to tell you what is happening... – Menelaos Jun 05 '13 at 17:02
  • Do you clear the hashmaps? – Denys Séguret Jun 05 '13 at 17:02
  • Do you have millions of lines? – AlexWien Jun 05 '13 at 17:03
  • Which *exact* version of Java are you using? It can make a big difference. How big is the file, and how much memory do you have? – Jon Skeet Jun 05 '13 at 17:03
  • Make sure that **all** the references to your Strings are going away. If they're still reachable from *anywhere*, they won't be collected. If you have tons and tons of lines, and you're storing those Strings in a whole pile of different HashMaps, you might just be out of memory, plain and simple. Try to determine (using your profiler) what objects are holding references to all those Strings (or other things!) – Henry Keiter Jun 05 '13 at 17:05
  • Can you show us how you add the Strings to the Hashmap? – Clark Kent Jun 05 '13 at 17:15

6 Answers

3

You said:

After finishing, lines are parsed according to special logic and their parts are put as keys into several HashMap.

If you're getting those parts via something like String.substring, that substring isn't a new copy: it actually points into the original string's character array, recording only the begin and end indexes that delimit the substring.

Consequently, the original string isn't garbage collected as long as any of those substrings exist. Clearing your collection won't help if those substrings were passed on to other parts of the system.

You'd need to make sure you created a completely new string, e.g.:

new String( myString.substring( 1, 5 ) );

Here's a link that looks decent (Googled "String substring points at original"). http://javarevisited.blogspot.com/2011/10/how-substring-in-java-works.html

Though apparently later JDK 1.7 releases have fixed this according to this: how the subString() function of string class works
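As a small sketch of that defensive copy (the CSV layout and key length here are made up for illustration):

```java
// Hypothetical illustration: extract a short key from a long CSV line.
// On JDKs before the substring fix, line.substring(...) shares the original
// line's char[] array, so the whole line stays reachable; wrapping the
// result in new String(...) makes an independent copy so the long line
// can be garbage collected.
public class SubstringCopy {
    public static String extractKey(String line) {
        String shared = line.substring(0, 5);   // may pin the whole line
        return new String(shared);              // independent copy
    }

    public static void main(String[] args) {
        String longLine = "key01,lots,of,other,fields,that,make,the,line,long";
        System.out.println(extractKey(longLine)); // prints "key01"
    }
}
```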

Chris Kessel
1

We need more code to be able to see if you have a "memory leak" somewhere.

Have you considered storing fewer rows in your list instead of reading the whole file into it? Additionally, you could try doing away with intermediate structures altogether.

  • Read 100 rows and add these to list
  • Iterate through, parse and add to hashmaps.
  • Clear list
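A sketch of that batching (the comma split in parseAndStore is only a stand-in for the OP's actual parsing logic):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BatchParser {
    static final int BATCH_SIZE = 100;

    public static Map<String, Integer> parseFile(String path) throws IOException {
        Map<String, Integer> counts = new HashMap<String, Integer>();
        List<String> batch = new ArrayList<String>(BATCH_SIZE);
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            String line;
            while ((line = reader.readLine()) != null) {
                batch.add(line);
                if (batch.size() == BATCH_SIZE) {
                    parseAndStore(batch, counts);
                    batch.clear(); // drop the batch before reading more rows
                }
            }
            parseAndStore(batch, counts); // leftover rows
        }
        return counts;
    }

    // Stand-in for the OP's "special logic": count comma-separated parts.
    static void parseAndStore(List<String> rows, Map<String, Integer> counts) {
        for (String row : rows) {
            for (String part : row.split(",")) {
                Integer old = counts.get(part);
                counts.put(part, old == null ? 1 : old + 1);
            }
        }
    }
}
```

This way at most BATCH_SIZE lines are alive at once, regardless of the file size.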

You can also increase the heap size, but if you don't find the leak you will just hit the same exception again with a larger file. Good that dystroy pointed this out.

Instructions to increasing heap are at: Increase heap size in Java

Example: java -Xmx6g myprogram

Menelaos
  • Increasing the heap size instead of looking for the problem looks like a bad reflex. – Denys Séguret Jun 05 '13 at 17:04
  • This is correct! Good that you mentioned. We need more code from the OP to be able to see if indeed there is a memory leak. It may be the case however that the user's problem will be solved by increasing the heap. Also, maybe reading less rows at a time (breaking up the problem to a smaller one) will also work. – Menelaos Jun 05 '13 at 17:06
1

GC in Java works well. If you get an OutOfMemoryError, you probably either have a memory leak (i.e. you are storing too much in your collections) or you did not give your application enough heap.

I believe that in your case the code never reaches the point that clears the collection; you probably fail during the parsing. In that case, first try giving your Java process more memory with the command-line option -Xmx, e.g. -Xmx1024M (1 GB).

That should let you find a setting with which the parsing finishes successfully.

Then, if you are working on a utility that parses files once and terminates, you are done. If, however, your application should keep running and parse more and more files, check whether memory usage grows after processing each file. If it grows, check whether that is by design or caused by a bug.

If it is by design, think about a redesign. BTW, do you really have to read all lines into memory and then process them? What kind of processing are you doing? Is there a chance that you can process your file line by line and dramatically decrease your memory usage?
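A line-by-line sketch of that last idea, with no intermediate list at all (again, the comma split is just a placeholder for the real parsing logic):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class StreamingParser {
    public static Map<String, Integer> process(String path) throws IOException {
        Map<String, Integer> keys = new HashMap<String, Integer>();
        try (BufferedReader in = new BufferedReader(new FileReader(path))) {
            String line;
            while ((line = in.readLine()) != null) {
                // Parse each line immediately; the line itself becomes
                // unreachable as soon as the next one is read.
                for (String part : line.split(",")) {
                    Integer old = keys.get(part);
                    keys.put(part, old == null ? 1 : old + 1);
                }
            }
        }
        return keys;
    }
}
```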

AlexR
1

If you fill those hashmaps with substrings of the lines in your records list, you are actually keeping each entire line in memory for every one of those substrings.

Have a look at: Memory leak traps in the Java Standard API

The answer in this case would be to use something like:

String key = new String(record.substring(6,12));

or

String key = record.substring(6,12).intern();
ljgw
1

It may be that you have enough memory, but the memory is fragmented. How you build your ArrayList and HashMap is critical. E.g. are you using StringBuilder?

Unless the entire code up to the error is shown, it is very hard to debug a memory problem remotely.

Also, it helps if we know the Java version, environment, etc.

Also, do not forget that if you have a lot of objects with different sizes, the memory gets fragmented more easily. And if the memory is barely enough to contain those objects, you can get memory errors.

Finally, you can request a garbage collection of your own (though most probably the JVM will know better :-) ).
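For completeness, a small sketch of requesting a collection and printing the (approximate) used heap, similar to the console checks the OP mentioned:

```java
public class MemProbe {
    public static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        rt.gc(); // only a hint; the JVM is free to ignore it
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.out.println("Approx. used heap: " + usedHeapBytes() / 1024 + " KB");
    }
}
```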

TFuto
0

The garbage collector can only reclaim an object once you lose all references to it. You say that some information is stored in HashMaps, so the garbage collector won't remove those strings.

Anonim