I have implemented a java program . This is basically a multi threaded service with fixed number of threads. Each thread takes one task at a time, create a hashSet , the size of hashset can vary from 10 to 20,000+ items in a single hashset. At end of each thread, the result is added to a shared collection List using synchronized.
The problem happens is at some point I start getting out of memory exception. Now after doing bit of research, I found that this memory exception occurs when GC is busy clearing the memory and at that point it stops the whole world to execute anything.
Please give me suggestions for how to deal with such large amount of data. Is Hashset a correct datastructure to be used? How to deal with memory exception, I mean one way is to use System.GC(), which is again not good as it will slow down the whole process. Or is it possible to dispose the "HashSet hsN" after I add it to the shared collection List?
Please let me know your thoughts and guide me for wherever I am going wrong. This service is going to deal with huge amout of data processing.
Thanks
//business object - to save the result of thread execution
public class Location{
integer taskIndex;
HashSet<Integer> hsN;
}
//task to be performed by each thread
public class MyTask implements Runnable {
MyTask(long task) {
this.task = task;
}
@Override
public void run() {
HashSet<Integer> hsN = GiveMeResult(task);//some function calling which returns a collection of integer where the size vary from 10 to 20000
synchronized (locations) {
locations.add(task,hsN);
}
}
}
public class Main {
private static final int NTHREDS = 8;
private static List<Location> locations;
public static void main(String[] args) {
ExecutorService executor = Executors.newFixedThreadPool(NTHREDS);
for (int i = 0; i < 216000; i++) {
Runnable worker = new MyTask(i);
executor.execute(worker);
}
// This will make the executor accept no new threads
// and finish all existing threads in the queue
executor.shutdown();
// Wait until all threads are finish
while (!executor.isTerminated()) {
}
System.out.println("Finished all threads");
}
}
For such implementation is JAVA a best choice or C# .net4?