
I have a requirement to process and map DTOs with values in a for loop, as shown below. Each of the mapping methods takes nearly 10 minutes to complete its business logic, which causes a performance delay. I am working on refining the algorithms of the business logic. In the meantime, please let me know whether these mapping methods can be processed in parallel to improve performance.

Since the application is compatible only with Java 7, I cannot use Java 8 streams.

for(Portfolio pf : portfolio) {
   mapAddress(pf);
   mapBusinessUnit(pf);
   mapRelationShipDetails(pf);
   // ...
   // ...
   // ...
}
Skanda

1 Answer


You could split the portfolios across several threads using either Runnable or Callable.

For example:

import java.util.List;
import java.util.concurrent.Callable;

public class PortfolioService implements Callable<List<Portfolio>>
{
    private final List<Portfolio> portfolios;

    public PortfolioService(List<Portfolio> portfolios)
    {
        this.portfolios = portfolios;
    }

    // Maps every portfolio in this task's slice and returns the slice.
    @Override
    public List<Portfolio> call()
    {
        for(Portfolio pf : portfolios) {
            mapAddress(pf);
            mapBusinessUnit(pf);
            ...
        }
        return portfolios;
    }
}

This needs some modifications in your main class. I am using Callable here because I don't know whether you want to do something with the mapped portfolios afterwards. If you only want the threads to do the work and don't need a return value, use Runnable instead and adapt the code accordingly.
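If no return value is needed, the Runnable variant mentioned above might look like the following sketch. The class name PortfolioTask, the nested Portfolio placeholder with its two fields, and the bodies of the mapping methods are all assumptions made for illustration; the real DTO and mapping logic are not shown in the question.

```java
import java.util.List;

public class PortfolioTask implements Runnable {

    /** Placeholder for the real DTO; both fields are assumptions. */
    public static class Portfolio {
        public String address;
        public String businessUnit;
    }

    private final List<Portfolio> portfolios;

    public PortfolioTask(List<Portfolio> portfolios) {
        this.portfolios = portfolios;
    }

    @Override
    public void run() {
        // Each task touches only the portfolios in its own slice.
        for (Portfolio pf : portfolios) {
            mapAddress(pf);
            mapBusinessUnit(pf);
        }
    }

    // Stand-ins for the real (slow) mapping methods.
    private void mapAddress(Portfolio pf) {
        pf.address = "mapped";
    }

    private void mapBusinessUnit(Portfolio pf) {
        pf.businessUnit = "mapped";
    }
}
```

Such a task can be handed to an ExecutorService with execute(...) or submit(...). The mapped values are then read from the original list once the pool has terminated.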

1) Get the number of available cores:

int threads = Runtime.getRuntime().availableProcessors();

2) Now split the workload per thread:

// minimum number of entries per thread
int blocksize = portfolios.size()/threads;
// entries left over when the size is not evenly divisible by the thread count
int overlap = portfolios.size()%threads;
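To see how blocksize and overlap turn into index ranges, here is a small self-contained sketch. BlockSplitter and split are hypothetical names, not part of the answer's code. It also stops early when there are fewer entries than threads, which is an addition of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class BlockSplitter {

    /**
     * Returns {fromIndex, toIndex} pairs (toIndex exclusive) that cover
     * the indices 0..size-1 in at most `threads` contiguous blocks.
     */
    public static List<int[]> split(int size, int threads) {
        int blocksize = size / threads; // minimum entries per block
        int overlap = size % threads;   // entries left over after even division
        List<int[]> ranges = new ArrayList<int[]>();
        int from = 0;
        for (int i = 0; i < threads && from < size; i++) {
            // The first `overlap` blocks take one extra entry each.
            int actualBlocksize = blocksize + (i < overlap ? 1 : 0);
            ranges.add(new int[] { from, from + actualBlocksize });
            from += actualBlocksize;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // 9 entries on 8 threads: one block of two entries, seven of one.
        for (int[] r : split(9, 8)) {
            System.out.println(r[0] + ".." + r[1]);
        }
    }
}
```

Each pair can be passed directly to portfolios.subList(from, to), since subList also treats its second argument as exclusive.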

3) Start an ExecutorService, create a list of Future elements, and keep a variable that remembers where the previous slice of the list ended:

ExecutorService exs = Executors.newFixedThreadPool(threads);
List<Future<List<Portfolio>>> futures = new ArrayList<>();
int oldIndex = 0;

4) Start threads

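for(int i = 0; i<threads; i++)
{
    int actualBlocksize = blocksize;
    // hand one leftover entry to each of the first blocks
    if(overlap != 0){
        actualBlocksize++;
        overlap--;
    }
    // subList takes (fromIndex, toIndex), so the end is oldIndex + actualBlocksize
    int newIndex = oldIndex + actualBlocksize;
    futures.add(exs.submit(new PortfolioService(portfolios.subList(oldIndex, newIndex))));
    oldIndex = newIndex;
}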

5) Shut down the ExecutorService and await its termination

exs.shutdown();
try {exs.awaitTermination(6, TimeUnit.HOURS);}
catch (InterruptedException e) { Thread.currentThread().interrupt(); } // restore the interrupt flag

6) Do something with the futures, if you want or need to.
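One thing that can be done with the futures is to merge the per-thread results back into a single list. The sketch below is only an illustration: FutureCollector, collect and constant are hypothetical names, and strings stand in for the mapped portfolios.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class FutureCollector {

    /** Waits for every future in list order and concatenates the results. */
    public static <T> List<T> collect(List<Future<List<T>>> futures)
            throws InterruptedException, ExecutionException {
        List<T> all = new ArrayList<T>();
        for (Future<List<T>> f : futures) {
            // get() blocks until the task finishes; an exception thrown
            // inside call() is rethrown here wrapped in an ExecutionException.
            all.addAll(f.get());
        }
        return all;
    }

    /** Stand-in task that simply returns the list it was given. */
    public static <T> Callable<List<T>> constant(final List<T> value) {
        return new Callable<List<T>>() {
            @Override
            public List<T> call() {
                return value;
            }
        };
    }

    public static void main(String[] args) throws Exception {
        ExecutorService exs = Executors.newFixedThreadPool(2);
        List<Future<List<String>>> futures = new ArrayList<Future<List<String>>>();
        // In the answer's code these would be PortfolioService instances.
        futures.add(exs.submit(constant(Arrays.asList("a", "b"))));
        futures.add(exs.submit(constant(Arrays.asList("c"))));
        exs.shutdown();
        System.out.println(collect(futures)); // prints [a, b, c]
    }
}
```

Calling get() is also the only place where a failure inside a mapping method becomes visible, so it is worth doing even if the returned lists themselves are not needed.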

maio290
  • Will try this. This logic is triggered when it is called from a REST service API. I also want to load test the API with JMeter at nearly 1000 concurrent requests. Should I change anything here to accommodate that concurrency? Please advise. – Skanda Jan 12 '18 at 10:01
  • In this example, I think actualBlocksize should be declared outside the for loop and incremented inside it as actualBlocksize += blocksize; I see that this yields the results. Also, please help me understand the overlap logic here. – Skanda Jan 12 '18 at 18:37
  • Please let me know whether applying the fork/join concept would cause any issues. – Skanda Jan 13 '18 at 15:08
  • Hi! An example for the overlap logic: Say you have 8 cores and an array with 9 entries, the blocksize would be 1 and you would compute 8 entries, but you want to have the 9th as well. So you add +1 to the actualBlocksize, until you have no more overlap left. You can actually modify the source code (not saying it's perfect) like you need it. It was just to give an example on how to use the Callable interface. Normally, there shouldn't be any issues here, only for a list < cores (missing handling for that). – maio290 Jan 14 '18 at 11:25
  • Okay, it is getting clearer. Please let me know your thoughts on the load test I mentioned. Are there any extra challenges I should be cautious about, theoretically? – Skanda Jan 14 '18 at 13:21
  • Without knowing your application I can only guess: when you have 1000 concurrent requests and every request calls code that occupies all available cores (1000*cores), this might kill your application completely. So this approach is really only good for a single instance, splitting the work for that one instance. Just to understand you properly: are you currently waiting 10 minutes * mapping * portfolio count per call? Then you may want to consider something like persistence (JPA) as well. – maio290 Jan 14 '18 at 13:48
  • I did try that using JPA. I have a portfolio entity that maps to several other entities. Earlier, executing the query and getting the result list of portfolios was time-consuming, so I marked the mappings as LAZY and got good performance while fetching the data. Now, in mapAddress() and the other map methods, when I access their attributes (e.g. address.getCity()), lots of queries are executed again (since these were marked as LAZY in the portfolio object), so I had to think about parallel processing. – Skanda Jan 14 '18 at 15:23
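On the concern about 1000 concurrent requests raised in the comments: one possible way to keep each request from creating its own pool of `cores` threads is to share a single fixed pool across the application. The sketch below assumes this is acceptable for the application; MappingExecutor is a hypothetical name, and the daemon threads are a choice of this sketch so that idle workers do not block JVM shutdown.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

public final class MappingExecutor {

    // One pool for the whole application, sized to the core count, so the
    // total number of mapping threads stays bounded however many requests arrive.
    private static final ExecutorService POOL = Executors.newFixedThreadPool(
            Runtime.getRuntime().availableProcessors(),
            new ThreadFactory() {
                @Override
                public Thread newThread(Runnable r) {
                    Thread t = new Thread(r, "mapping-worker");
                    t.setDaemon(true); // do not block JVM shutdown
                    return t;
                }
            });

    private MappingExecutor() { }

    public static ExecutorService get() {
        return POOL;
    }
}
```

With a shared pool, a request should not call shutdown() on it. Waiting for that request's own tasks can be done with invokeAll(...) or by calling get() on each returned Future. Extra requests then queue behind the running tasks instead of multiplying the thread count.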