-2

I have a database record of around 1000000 paragraphs with around ~500 characters each. By reading all the records, I need to get the list of alphabet ordered by most to least used.

I mock the database reading by creating stream up to 1000000 then process the stream in parallel

final Map<Character, Long> charCountMap = new ConcurrentHashMap<>();
for (char c = 'a'; c <= 'z'; c++) {
    charCountMap.put(c, 0l);
}

System.out.println("Parallel Stream");
long start = System.currentTimeMillis();
Stream.iterate(0, i -> i).limit(1000000).parallel() //mock database stream
    .forEach(i-> RandomStringUtils.randomAlphanumeric(500)
    .toLowerCase().chars().mapToObj(c -> Character.valueOf((char) c)).filter(c -> c >= 97 && c <= 122)
    .forEach(c -> charCountMap.compute(c, (k, v) -> v + 1))); //update ConcurrentHashMap

long end = System.currentTimeMillis();
System.out.println("Parallel Stream time spent :" + (end - start));

System.out.println("Serial Stream"); start = System.currentTimeMillis();
Stream.iterate(0, i -> i).limit(1000000) //mock database stream
    .forEach(i-> RandomStringUtils.randomAlphanumeric(500)
    .toLowerCase().chars().mapToObj(c -> Character.valueOf((char) c)).filter(c -> c >= 97 && c <= 122)
    .forEach(c -> charCountMap.compute(c, (k, v) -> v + 1)));
end = System.currentTimeMillis();
System.out.println("Serial Stream time spent :" + (end - start));

I initially thought that parallel stream would be faster even with expected overhead for stream larger than 100,000. However, test shows that serial stream is ~5X faster than parallel even for 1,000,000 records.

I suspected it was because of updating the ConcurrentHashMap. But when I removed it and change with empty function, there is still significant performance gap.

Is there something wrong in my database mock up call or the way I use parallel stream?

NothingBox
  • 345
  • 5
  • 15
  • Going by memory here, so take it with a grain of salt, but when calculating elapsed time, you should use `System.nanoTime()`, not `System.currentTimeMillis()`. If you have to do benchmarking on you own, this [question](https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java) is a great source. – Chaosfire May 21 '22 at 11:45
  • 1
    You can't make any conclusions without warm-up runs and proper benchmarks. Just try swapping over the order of serial and parallel code sections, and you may come to the opposite conclusion because the first iteration test is slowest. Also the random generator may not be threadsafe (cannot tell as you don't show the code) which may mean that the parallel version can never be quickest. – DuncG May 21 '22 at 15:37
  • I have swapped the order and got the same result. The RandomStringUtils I am using is from Apache commons-lang library – NothingBox May 21 '22 at 15:56

1 Answers1

0

Using RandomStringUtils.randomAlphanumeric(500) isn't suitable for use with parallel() as according to the code here it uses a static variable for the random string generation. Therefore all the calls from all the threads to generate a random string will have contention on the same underlying instance of Random:

private static final Random RANDOM = new Random();

Write your own random string generator which uses single instance of Random per thread or has use of java.util.concurrent.ThreadLocalRandom - this avoids contention on the random sequences. The same issue causes poor performance in this question before it was edited to use ThreadLocalRandom.

See javadoc for java.util.Random says:

Instances of java.util.Random  are threadsafe.
However, the concurrent use of the same java.util.Random 
instance across threads may encounter contention and consequent
poor performance. Consider instead using 
java.util.concurrent.ThreadLocalRandom in multithreaded
designs.
DuncG
  • 12,137
  • 2
  • 21
  • 33