0

My question is how I would implement multithreading to this task correctly.

I have a program that takes quite a long time to finish executing. About an hour and a half. I need to generate about 10,000 random and unique number codes. The code below is how I first implemented it and have it right now.

import java.util.Random;
import java.util.ArrayList;

public class Main
{
    public static void main(String[] args) {
        
        Random random = new Random();
        
        // This holds all the codes
        ArrayList<String> database = new ArrayList<>();
        
        int counter = 0;
        while(counter < 10000){
            
            // Generate a 10 digit long code and append to sb
            StringBuilder sb = new StringBuilder();
            for(int i = 0; i < 10; i++){
                sb.append(random.nextInt(10));
            }
            String code = String.valueOf(sb);
            sb.setLength(0);
            
            // Check if this code already exists in the database
            // If not, then add the code and update counter
            if(!database.contains(code)){
                
                database.add(code);
                counter++;
            }
        }
        
        System.out.println("Done");
    }
}

This of course is incredibly inefficient. So my question is: Is there is a way to implement multithreading that can work on this single piece of code? Best way I can word it is to give two cores/ threads the same code but have them both check the a single ArrayList? Both cores/ threads will generate codes but check to make sure the code it just made doesn't already exist either from the other core/ thread or from itself. I drew a rough diagram below. Any insight, advice, or pointers is greatly appreciated.

Petetunze
  • 35
  • 4
  • Are the calls to the database your possible bottleneck? If so, you can do the checks for duplication in the Java code itself and then run a single batch update to the database. – stepanian Mar 21 '22 at 07:29
  • Are you interested specifically in parallelizing this, or just in making it faster? In any case, the first step would be to profile to find out where to improve. I suspect parallelization has limited potential here because of the uniqueness constraint, but replacing your ArrayList with a HashSet could already give a significant boost, for example. – OhleC Mar 21 '22 at 07:30
  • 1
    Oh, also, representing your codes as Strings and building them up character by character (and needing to compare strings for uniqueness checks later) is probably extremely inefficient as well. – OhleC Mar 21 '22 at 07:32
  • IMHO the most efficient way would be to replace the `ArrayList database = new ArrayList<>();` with `HashSet database = new HashSet<>();` because this changes `!database.contains(code)` from O(N) to O(1). – Thomas Kläger Mar 21 '22 at 07:35

3 Answers3

2

Start with more obvious optimizations:

  1. Do not use ArrayList, use HashSet. ArrayList contains() time complexity is O(n), while HashSet is O(1). Read this question about Big O summary for java collections framework. Read about Big O notation.
  2. Initialize your collection with appropriate initial capacity. For your case that would be:
new HashSet<>(10000);

Like this underlying arrays won't be copied to increase their capacity. I would suggest to look/debug implementations of java collections to better understand how they work under the hood. Even try to implement them on your own.

Before you delve into complex multithreading optimizations, fix the simple problems - like bad collection choices.

Edit: As per suggestion from @Thomas in comments, you can directly generate a number(long) in the range you need - 0 to 9_999_999_999. You can see in this question how to do it. Stringify the resulting number and if length is less than 10, pad with leading zeroes.

Chaosfire
  • 4,818
  • 4
  • 8
  • 23
  • 1
    I'd also add trying to reduce the calls to `random.nextInt(...)`. Instead of 10 individual digits one could generate 2 random 5-digit numbers or - if using a RNG that can produce long values - a single 10-digit number right away. Then just pad with leading zeros if required. – Thomas Mar 21 '22 at 07:41
  • @Thomas Good one, made an edit to include it. – Chaosfire Mar 21 '22 at 08:45
2

Using a more appropriate data structure and a more appropriate representation of the data, this should be a lot faster and easier to read, too:

Set<Long> database = new HashSet<>(10000);
while(database.size() < 10000){
    database.add(ThreadLocalRandom.current().nextLong(10_000_000_000L);
}
OhleC
  • 2,821
  • 16
  • 29
0

Example: (use ConcurrentHashMap, use threads, use random.nextLong())

 public class Main {
        static Map<String,Object>  hashMapCache =  new ConcurrentHashMap<String,Object>();


    public static void main(String[] args) {

        Random random = new Random();

        // This holds all the codes
        ArrayList<String> database = new ArrayList<>();

        int counter = 0;
        int NumOfThreads = 20;
        int total = 10000;
        int numberOfCreationsForThread = total/NumOfThreads;
        int leftOver = total%NumOfThreads;
        List<Thread> threadList = new ArrayList<>();

        for(int i=0;i<NumOfThreads;i++){
            if(i==0){
                threadList.add(new Thread(new OneThread(numberOfCreationsForThread+leftOver,hashMapCache)));
            }else {
                threadList.add(new Thread(new OneThread(numberOfCreationsForThread,hashMapCache)));
            }
        }
        for(int i=0;i<NumOfThreads;i++){
            threadList.get(i).start();;
        }
        for(int i=0;i<NumOfThreads;i++){
            try {
                threadList.get(i).join();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
        for(String key : hashMapCache.keySet()){
            database.add(key);
        }

        System.out.println("Done");
    }}

OneThread:

public class OneThread implements Runnable{

int numberOfCreations;
Map<String,Object> hashMapCache;



public OneThread(int numberOfCreations,Map<String,Object> hashMapCache){
    this.numberOfCreations = numberOfCreations;
    this.hashMapCache = hashMapCache;
}
@Override
public void run() {
    int counter = 0;
    Random random = new Random();

    System.out.println("thread "+ Thread.currentThread().getId() + " Start with " +numberOfCreations);
    while(counter < numberOfCreations){
        String code = generateRandom(random);
        while (code.length()!=10){
            code = generateRandom(random);
        }
        // Check if this code already exists in the database
        // If not, then add the code and update counter
        if(hashMapCache.get(code)==null){
            hashMapCache.put(code,new Object());
            counter++;
        }
    }
    System.out.println("thread "+ Thread.currentThread().getId() + " end with " +numberOfCreations);
}

private static String generateRandom(Random random){
    return String.valueOf(digits(random.nextLong(),10));
}
/** Returns val represented by the specified number of hex digits. */
private static String digits(long val, int digits) {
    val = val > 0 ? val : val*-1;
    return Long.toString(val).substring(0,digits);
}

}

Sarel Foyerlicht
  • 907
  • 6
  • 11