I'm running a WEKA classifier (J48, with an input .arff file composed of 3 fields; field 1 has ~27k distinct attributes, field 2 ~500k values) on a latest-generation MacBook Pro with 8GB RAM. I increased the Java heap space to the maximum possible using the -Xmx parameter:

java -Xmx7G -cp weka-3-6-10/weka.jar weka.classifiers.trees.J48 -t myfiles/loc_linear.arff -i

However, when I run the classifier I get the following error after about 10 minutes: "Exception in thread "main" java.lang.OutOfMemoryError: Java heap space".

Evidently 8GB RAM is not enough for my input file. Does this mean the only solution is more powerful hardware (e.g. 16GB RAM or a very powerful server/cluster)? Is there any workaround for this issue (e.g. reducing the input file, and if so, which criteria would you apply in the reduction)? Any other ideas or suggestions?

Albz

2 Answers

If you are running the Weka GUI on a Mac OS X machine, you can edit a plist configuration file. I followed instructions from the Weka mailing list.

  1. cd into /Applications/weka-XXX.app/Contents, or wherever your Weka application was installed.

  2. There will be a file called Info.plist there. I suggest you save a copy of that file to another location, as you'll need to change it in the next step.

  3. Open the weka-XXX.app/Contents/Info.plist (XML) file in your favorite text editor and look for a block that says "VMOptions". There should be a value that says "-Xmx256M" which specifies the memory. Change that value to something bigger, like "-Xmx1024M".

  4. Start Weka.

stackoverflowuser2010

From your cited line of code, it seems you are running Weka from the simple command line interface. If that is the case, then the answer is the same as for this question: "Increase heap to avoid Out of Memory Error in WEKA".

You can't increase the heap size from the command line interface. Instead, I believe you should increase the heap size in the RunWeka.ini file, as stated in Weka's instructions.
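Whichever way the heap size is set, it can help to check what limit the JVM actually received. The following is a minimal sketch using only the standard java.lang.Runtime API; it is not part of Weka, and the class name HeapCheck and the method maxHeapMegabytes are hypothetical names chosen for illustration.

```java
public class HeapCheck {

    /** Returns the maximum heap size the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Heap currently in use = heap allocated so far minus the free part of it.
        long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024L * 1024L);
        System.out.println("Max heap (MB):  " + maxHeapMegabytes());
        System.out.println("Used heap (MB): " + usedMb);
    }
}
```

Compile it with javac HeapCheck.java and run it with the same option as the classifier, for example java -Xmx7G HeapCheck. The reported maximum may be somewhat lower than the requested value, depending on the JVM and garbage collector, but it should change when the -Xmx value changes.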

Walter
    Thanks, however RunWeka.ini is exclusive to Windows systems. I'm using Mac OS X. I was able to increase the heap through the command line using -Xmx. It works for me: if I check the memory usage in real time, I see that when the parameter is applied it actually uses more memory and runs for longer. I also tried a different approach in the WEKA GUI for Mac OS: I edited the heap size in the Info.plist file (http://tinyurl.com/q4ow2u2), the "counterpart of RunWeka.ini" on Mac OS. The behavior is the same as with the command line: it really seems that my input file requires more than 8GB RAM. – Albz Sep 29 '13 at 15:01
    Yikes, needing more than 8GB is intense. Could you do some attribute selection with Weka? That may be a way to pare down the size of your dataset. A good measure to use for attribute selection in Weka could be GainRatioAttributeEval (Gain Ratio is what J48 uses to decide the branches of the tree). – Walter Sep 29 '13 at 15:45