
After a lot of research, I learned that String is immutable and that StringBuffer is more efficient than String if the program involves many computations. But my question is slightly different from these.

I have a function to which I pass a string. The string is actually the text of an article (nearly 3000-5000 characters). The function runs in threads; that is, it is called multiple times, each time with a different String text. The later-stage computations in the function are very heavy. When I run my code with a large number of threads, I get an error saying: GC Overhead Limit Exceeded.

Since I can't reduce the computations in the later stage of the function, my question is: will it really help if I change the text type from String to StringBuffer? Also, I don't do any concatenation on the text string.

I have posted a small snippet of my code:

public static List<Thread> thread_starter(List<Thread> threads, String filename, ArrayList<String> prop, Logger L, Logger L1, int seq_no)
{
    String text = "";
    if (prop.get(7).matches("txt"))
        text = read_contents.read_from_txt(filename, L, L1);
    else if (prop.get(7).matches("xml"))
        text = read_contents.read_from_xml(filename, L, L1);
    else if (prop.get(7).matches("html"))
        text = read_contents.read_from_html(filename, L, L1);
    else
    {
        System.out.println("not a valid config");
        L1.info("Error : config file not properly defined for i/p file type");
    }

    /*TODO */
    //System.out.println(text);
    /*TODO CHANGES TO BE DONE HERE */
    if (text.length() > 0)
    {
        Runnable task = new MyRunnable(text, filename, prop, filename, L, L1, seq_no);
        Thread worker = new Thread(task);
        worker.start();
        // Remember the thread for later usage
        threads.add(worker);
    }
    else
    {
        main_entry_class.file_mover(filename, false);
    }
    return threads;
}

And I'm calling the above function repeatedly using the following code:

List<Thread> threads = new ArrayList<Thread>();
thread_count = 10;
int file_pointer = 0; // INTEGER POINTER VARIABLE
do
{
    if (file.size() <= file_pointer)
        break;
    else
    {
        String file_name = file.get(file_pointer);
        threads = thread_starter(threads, file_name, prop, L, L1, seq_no);
        file_pointer++;
        seq_no++;
    }
} while (check_status(threads, thread_count) == true);

And the check_status function:

public static boolean check_status(List<Thread> threads, int thread_count)
{
    int running = 0;
    boolean flag = false;
    do {
        running = 0;
        for (Thread thread : threads) {
            if (thread.isAlive()) {
                //ThreadMXBean thMxB = ManagementFactory.getThreadMXBean();
                //System.out.println(thMxB.getCurrentThreadCpuTime());
                running++;
            }
        }
        if (Thread.activeCount() - 1 < thread_count)
        {
            flag = true;
            break;
        }
    } while (running > 0);
    return flag;
}
kiran
  • What do you mean by "computations"? – Dawood ibn Kareem Mar 12 '14 at 05:34
  • You're essentially running out of memory to run the process smoothly. Check whether there is a memory leak, or increase the JVM heap size. – Kick Mar 12 '14 at 05:35
  • @David Wallace: By computation, I actually meant performing Named Entity Recognition. Named Entity Recognition takes a huge amount of memory. Also, since I can't reduce anything in the NER part, I just wanted to know if changing from String to StringBuffer would help. – kiran Mar 12 '14 at 05:38
  • Your issue is not related to using String or StringBuffer. It is something else. Please do not mix these two. – UVM Mar 12 '14 at 05:39

2 Answers


If you are getting the GC Overhead Limit Exceeded error, you could first try a moderate heap size such as -Xmx512m. Also, if you have a lot of duplicate strings, you can call String.intern() on them.
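As a rough illustration of the String.intern() suggestion, here is a minimal sketch. The class name InternDemo and the helper deduplicate are hypothetical and not part of the question's code; it only pays off when many strings really do have equal contents, which may not hold for distinct article texts.

```java
import java.util.ArrayList;
import java.util.List;

final class InternDemo {
    /** Returns canonical copies so that equal strings share one instance. */
    static List<String> deduplicate(List<String> tokens) {
        List<String> result = new ArrayList<>(tokens.size());
        for (String token : tokens) {
            // intern() returns the pooled instance for this character sequence
            result.add(token.intern());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = new ArrayList<>();
        // new String(...) forces two distinct objects with equal contents
        tokens.add(new String("London"));
        tokens.add(new String("London"));
        System.out.println(tokens.get(0) == tokens.get(1)); // false: two objects

        List<String> canonical = deduplicate(tokens);
        System.out.println(canonical.get(0) == canonical.get(1)); // true: one object

        // Shows the heap ceiling the JVM is actually running with (set via -Xmx)
        System.out.println("max heap bytes: " + Runtime.getRuntime().maxMemory());
    }
}
```

Running it with a flag such as java -Xmx512m InternDemo should change the last line of output, which is a quick way to confirm the heap setting is being picked up.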

You may check this doc:

-XX:+UseConcMarkSweepGC
Rahul Tripathi
  • The OP mentioned in the question that 'there is multiple call of function with different String text each time'. – Kick Mar 12 '14 at 05:38
  • I have already tried that. I have 2GB of system memory. I tried setting -Xmx1g, -Xmx1500m, etc., but I still get the same error. – kiran Mar 12 '14 at 05:39
  • @kiran: I don't think your issue is caused by String or StringBuffer; rather, I think you're essentially running out of memory to run the process smoothly. Could you share your code, please? – Rahul Tripathi Mar 12 '14 at 05:41
  • @Youngistan: I meant that I am creating multiple threads to process multiple articles, and this function is part of a thread. Hence, for each thread a new String of article text is created. – kiran Mar 12 '14 at 05:41
  • So @kiran, can you please paste the code? Maybe someone can optimize it. – Kick Mar 12 '14 at 05:43
  • @kiran: The JVM will start with memory usage at the initial heap level. If the max heap is higher, it will grow to the max heap size as memory requirements exceed its current memory. Check this link: http://javarevisited.blogspot.com/2011/11/hotspot-jvm-options-java-examples.html – Rahul Tripathi Mar 12 '14 at 05:43
  • @Youngistan: I have already pasted a small snippet of code in my question. – kiran Mar 12 '14 at 06:14
  • The code you have pasted only contains one string and one thread of class MyRunnable that you are adding. I don't find any multi-threading there. @kiran – Kick Mar 12 '14 at 06:22
  • @Youngistan: I have pasted the code. The multi-threading is done in the do-while loop. – kiran Mar 12 '14 at 07:50
  • @kiran remove the running = 0; statement inside the do block. Also paste the run method of the Runnable class. – Kick Mar 12 '14 at 08:23
  • @Youngistan: running is the variable that keeps count of the number of threads running at a particular time. Removing it would disable monitoring of the number of threads created at any point in time. – kiran Mar 12 '14 at 10:14

Check out this link to learn what the GC Overhead Limit Exceeded error is: GC overhead limit exceeded.

As the page suggests, this out-of-memory error occurs when the program spends too much time in garbage collection. So the problem is not the number of computations you do; it is the way you have implemented them. You might have a loop creating too many objects or something similar, so a StringBuffer might not help you.
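One implementation change that might reduce memory pressure is to cap the number of articles being processed at once with a fixed-size thread pool, and to read each article inside its task rather than before the thread starts. The following is only a sketch under assumptions: loadText and analyze are hypothetical placeholders for the question's file reading and NER step, and the pool size would correspond to the question's thread_count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BoundedArticleProcessor {
    /** Hypothetical placeholder for reading an article from disk. */
    static String loadText(String fileName) {
        return "contents of " + fileName;
    }

    /** Hypothetical placeholder for the heavy per-article work (e.g. NER). */
    static int analyze(String text) {
        return text.length();
    }

    /** Processes every file, with at most threadCount articles in flight. */
    static List<Integer> processAll(List<String> fileNames, int threadCount) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threadCount);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String fileName : fileNames) {
                // The text is loaded inside the task, so only running tasks hold article text
                Callable<Integer> job = () -> analyze(loadText(fileName));
                futures.add(pool.submit(job));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : futures) {
                results.add(future.get()); // blocks until that task finishes
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> files = Arrays.asList("a.txt", "b.xml", "c.html");
        System.out.println(processAll(files, 2));
    }
}
```

The pool queues pending work itself, so a polling loop over thread.isAlive() is not needed in this design. Whether this actually removes the error depends on how much memory a single analyze call requires.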

anirudh