I have to do a Monte Carlo simulation using Java. I am in the following situation:

for (int w = 0; w < n; w++) {

   for (int t = 0; t < 25; t++) {

      for (int a = 0; a < 20; a++) {

           // ...calculations...

      }
   }
}

where n tends to be really big (on the order of millions). Moreover, the t and a loops (and the calculations inside them) are INDEPENDENT of the w loop (I use the w loop only to repeat the matrix calculation n times). This means I don't care in which order the w iterations run.

Is there a (preferably uncomplicated, since I have never done any parallel programming) way to split the outer for loop and run it in parallel using different threads (e.g. if I have a quad-core processor, to use all of the cores and not just one)?

Edited after @tevemadar's solution.

From what I have understood, I can do something like this:


import java.util.Random;
import java.util.stream.IntStream;

public class MyMonteCarloClass {
  private static double[][] monteCarloSums = new double[20][25];
  Random generator = new Random();

  private void incrementSum() {
    for (int t = 0; t < 25; t++) {
      for (int a = 0; a < 20; a++) {
        monteCarloSums[a][t] += generator.nextGaussian();
      }
    }
  }

  public double[][] getValue(int numberOfSim) {
    IntStream.range(0, numberOfSim).parallel().forEach(i -> incrementSum());
    return monteCarloSums;
  }
}

Will something like this be faster than having three nested loops?

  • Have you tried using threads? – Gautham M Apr 03 '21 at 09:09
  • Do these help? https://stackoverflow.com/questions/29998976/why-intstream-range0-n-in-java-8-shouldnt-be-parallel and https://stackoverflow.com/questions/26838242/why-does-intstream-range0-100000-parallel-foreach-take-longer-then-normal-f and https://stackoverflow.com/questions/48754107/making-a-parallel-intstream-more-efficient-faster Try searching the Internet for "java parallel intstream". – Abra Apr 03 '21 at 09:26
  • Thank you all!! @abra's suggestion seems to be really useful! – Yoda And Friends Apr 03 '21 at 09:57

3 Answers

2

With IntStream you can easily rewrite a "classic" counting loop,

for(int i=0;i<10;i++) {
  System.out.print(i);
}

as

IntStream.range(0, 10).forEach(i->{
  System.out.print(i);
});

Both of them print 0123456789.
Then a stream can be processed in parallel:

IntStream.range(0, 10).parallel().forEach(i->{
  System.out.print(i);
});

and it will suddenly produce a mixed order, like 6589724310. So it ran in parallel, and you don't have to deal with threads, executors, tasks and the like.

You have to deal with a couple things though:

  • just like methods in anonymous inner classes, lambda functions can access only "effectively final" variables from the outer scope. So if you have int j=0; in front of the loop, you can't write j=1; in the loop. But you can alter object members and array items (so j.x=1; or j[0]=1; would work)
  • you mention Monte Carlo, so it may be worth pointing out that random number generators are not big fans of parallel access. There is a ThreadLocalRandom.current() call which gets you a random number generator per thread
  • also, you are certainly collecting your results somewhere, and since you explicitly write that the large n is not used for anything, keep in mind that multiple threads may try to update a single location of your collector array/object at the same time, which may or may not be a problem.
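To make the last two points concrete, here is a minimal sketch of one way to avoid both problems with a parallel IntStream. It assumes the calculation is a running sum of Gaussian draws into a 20 x 25 matrix, as in the edited question; the class name ParallelMonteCarlo and the method simulate are made up for illustration. Each worker accumulates into its own private matrix through the three-argument collect, and the partial matrices are merged at the end, so no two threads ever write to the same cell.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.IntStream;

class ParallelMonteCarlo {
    static final int A = 20; // size of the a loop in the question
    static final int T = 25; // size of the t loop in the question

    // Runs n independent simulations in parallel and returns the summed matrix.
    static double[][] simulate(int n) {
        return IntStream.range(0, n).parallel().collect(
            () -> new double[A][T],   // a private partial-sum matrix per chunk of work
            (sums, w) -> {            // one simulation; the value of w is not used
                ThreadLocalRandom random = ThreadLocalRandom.current();
                for (int t = 0; t < T; t++) {
                    for (int a = 0; a < A; a++) {
                        // placeholder for the real calculation (an assumption)
                        sums[a][t] += random.nextGaussian();
                    }
                }
            },
            (left, right) -> {        // merge two partial matrices into the left one
                for (int a = 0; a < A; a++) {
                    for (int t = 0; t < T; t++) {
                        left[a][t] += right[a][t];
                    }
                }
            });
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        double[][] sums = simulate(n);
        System.out.println("mean of cell [0][0]: " + sums[0][0] / n);
    }
}
```

Whether this beats the plain nested loops depends on how expensive the real calculation is, so it is worth timing both versions on your own machine.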
tevemadar
  • Thank you so much!!! I have another question, which follows from the "edit" sign in the question (I am not able to write it here in the comments). @tevemadar – Yoda And Friends Apr 03 '21 at 14:17
  • @YodaAndFriends the new question should rather be posted as, well, a new question. This is a fundamental idea of StackOverflow, one question per post. – tevemadar Apr 03 '21 at 15:03
0

Use a ThreadPoolExecutor for queued parallel execution of Runnables. You can create a simple ThreadPoolExecutor using the Executors utility class.

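import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

ExecutorService executor = Executors.newFixedThreadPool(10); // 10 = number of threads

for (int w = 0; w < n; w++) {

    final int w_final = w; // lambdas can only capture effectively final variables

    executor.execute(() -> {

        for (int t = 0; t < 25; t++) {

            for (int a = 0; a < 20; a++) {

                // ...calculations...
                // w_final instead of w here

            }
        }
    });
}

executor.shutdown(); // accept no new tasks; already queued tasks still run to completion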
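With n in the millions, queuing one tiny task per w adds scheduling overhead for every iteration. A possible refinement, sketched here under the assumption that the calculation is a sum of Gaussian draws as in the edited question, is to submit one task per thread and let each task run its share of the iterations. The names ChunkedMonteCarlo, chunkSize and simulate are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadLocalRandom;

class ChunkedMonteCarlo {
    static final int A = 20;
    static final int T = 25;

    // Number of iterations assigned to task k when n iterations are split over `threads` tasks.
    static int chunkSize(int n, int threads, int k) {
        return n / threads + (k < n % threads ? 1 : 0);
    }

    static double[][] simulate(int n, int threads) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<double[][]>> tasks = new ArrayList<>();
            for (int k = 0; k < threads; k++) {
                final int iterations = chunkSize(n, threads, k);
                tasks.add(() -> {
                    double[][] local = new double[A][T]; // private to this task
                    ThreadLocalRandom random = ThreadLocalRandom.current();
                    for (int w = 0; w < iterations; w++) {
                        for (int t = 0; t < T; t++) {
                            for (int a = 0; a < A; a++) {
                                // placeholder for the real calculation (an assumption)
                                local[a][t] += random.nextGaussian();
                            }
                        }
                    }
                    return local;
                });
            }
            // invokeAll blocks until every task has finished.
            double[][] total = new double[A][T];
            for (Future<double[][]> future : executor.invokeAll(tasks)) {
                double[][] part = future.get();
                for (int a = 0; a < A; a++) {
                    for (int t = 0; t < T; t++) {
                        total[a][t] += part[a][t];
                    }
                }
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        double[][] sums = simulate(n, Runtime.getRuntime().availableProcessors());
        System.out.println("mean of cell [0][0]: " + sums[0][0] / n);
    }
}
```

Because every task writes only to its own local matrix, the only shared step is the final merge, which runs on the calling thread.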
rafd70
0

You want to utilize the compute capacity available for complex calculation and that is a valid scenario.

Multithreading can make better use of system resources and improve performance: CPU time is not wasted while one thread is blocked, because other threads can use it to perform the required computation.

But concurrency without correctness doesn't make any sense. To get the correct result, you may need to synchronize the parallel computations (if they are interrelated).

Since you are new to Java concurrency, I suggest you use the Executor framework.

The framework is built around the Executor and ExecutorService interfaces and the ThreadPoolExecutor class, which abstract away most of the threading complexity and provide high-level methods for executing tasks and managing the thread lifecycle.

Let's start with a very basic solution, which you can then evolve as per your requirements.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class WorkDistributer {
    private static final int THREAD_COUNT = 8;

    public static void main(String... args) {

        try {
            final ExecutorService executor = Executors.newFixedThreadPool(THREAD_COUNT);
            int n = Integer.MAX_VALUE;
            for (int w = 0; w < n; w++) {
                executor.execute(new Worker());
            }
            // Stop accepting new tasks. Without this call, awaitTermination
            // can only time out and the pool threads keep the JVM alive.
            executor.shutdown();
            executor.awaitTermination(5, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            e.printStackTrace();
        }

    }
}

class Worker implements Runnable {
    @Override
    public void run() {
        for (int t = 0; t < 25; t++) {
            for (int a = 0; a < 20; a++) {
                // System.out.println("Doing calculation....");
            }
        }
    }

}


Note:

  1. Here we have not implemented a synchronization mechanism, but you may need one depending on your requirements.
  2. Adjust THREAD_COUNT to your system configuration. My system has 4 cores and 8 logical processors, and I was able to achieve 100% utilization using 8 threads.
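As an illustration of the synchronization note, here is a minimal sketch of workers that add into a shared result matrix safely. It is only one option among several (a synchronized block or per-task local sums would also work) and it assumes the result is a 20 x 25 matrix of running sums, as in the question. The classes SharedSums and SynchronizedWorkers and the draw parameter are hypothetical names; DoubleAdder from java.util.concurrent.atomic handles the concurrent additions.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.DoubleAdder;
import java.util.function.DoubleSupplier;

// A matrix of thread-safe accumulators shared by all workers.
class SharedSums {
    static final int A = 20;
    static final int T = 25;
    private final DoubleAdder[][] cells = new DoubleAdder[A][T];

    SharedSums() {
        for (DoubleAdder[] row : cells) {
            for (int t = 0; t < T; t++) {
                row[t] = new DoubleAdder();
            }
        }
    }

    void add(int a, int t, double value) { cells[a][t].add(value); }

    double get(int a, int t) { return cells[a][t].sum(); }
}

class SynchronizedWorkers {
    // Runs n tasks on `threads` threads; `draw` supplies the value added to each cell.
    static SharedSums run(int n, int threads, DoubleSupplier draw) {
        SharedSums sums = new SharedSums();
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        for (int w = 0; w < n; w++) {
            executor.execute(() -> {
                for (int t = 0; t < SharedSums.T; t++) {
                    for (int a = 0; a < SharedSums.A; a++) {
                        sums.add(a, t, draw.getAsDouble());
                    }
                }
            });
        }
        executor.shutdown(); // no new tasks; queued ones still run
        try {
            if (!executor.awaitTermination(5, TimeUnit.MINUTES)) {
                throw new IllegalStateException("simulation did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            throw new IllegalStateException(e);
        }
        return sums;
    }

    public static void main(String[] args) {
        // One thread per logical processor is a reasonable starting point.
        int threads = Runtime.getRuntime().availableProcessors();
        int n = 100_000;
        SharedSums sums = run(n, threads, () -> ThreadLocalRandom.current().nextGaussian());
        System.out.println("mean of cell [0][0]: " + sums.get(0, 0) / n);
    }
}
```

The main method also shows Runtime.getRuntime().availableProcessors() as a way to pick the thread count instead of hard-coding it.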


Amit Meena