1

Suppose I want to read a huge file in which each line represents a domain object, and I need to store this information in a cache. The file is read by multiple threads. Each thread reads a certain range of lines and puts the mapped objects into a List. When all the submitted tasks have finished, I should have a full list with all the objects from the file.

  1. CopyOnWriteArrayList: I can't use it, because it creates a copy on each write, so the load on memory would be too high.
  2. ArrayList: I can create a new ArrayList for each task, insert the objects read by that task into its local ArrayList, and return it as a Future. When all tasks are done, I merge all the ArrayLists into one. Here the number of ArrayLists equals the number of tasks I have created.

Is there a better concurrent List data structure I can use for storing the objects?

Koret

3 Answers

0

Not really. Your ArrayList strategy is as good as it gets and is e.g. equivalent to what parallelStream().collect(toList()) does.
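For illustration, here is a minimal sketch of the stream variant mentioned above. The `DomainObject` class and the `parse` method are hypothetical stand-ins for the real mapping, and an in-memory list of strings stands in for the lines of the file.

```java
import java.util.List;
import java.util.stream.Collectors;

public class ParallelCollectSketch {

    // Hypothetical domain object: one instance per line of the file.
    public static final class DomainObject {
        public final String raw;

        public DomainObject(String raw) {
            this.raw = raw;
        }
    }

    // Hypothetical mapping from a line to a domain object.
    public static DomainObject parse(String line) {
        return new DomainObject(line.trim());
    }

    // Each worker thread accumulates into its own intermediate container;
    // the collector merges them at the end, so no concurrent List is needed.
    public static List<DomainObject> load(List<String> lines) {
        return lines.parallelStream()
                .map(ParallelCollectSketch::parse)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<DomainObject> objects = load(List.of(" a", "b ", "c"));
        System.out.println(objects.size());
    }
}
```

Because the source list is ordered, the collected result keeps the original line order even though the mapping runs in parallel.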

Louis Wasserman
0

CopyOnWriteArrayList is probably not the right candidate for this scenario:

  1. The given use-case seems to be write heavy (write-only followed by read-only)
  2. CopyOnWriteArrayList allows only a single writer at any given time (though readers can execute concurrently)

In this scenario, using CopyOnWriteArrayList might perform worse than synchronizedList or Vector. Ref SO: synchronizedList

For the current use case, as pointed out by @Louis Wasserman, it's better to populate an isolated list in each thread and then combine the lists at the end.

  1. The combination step can be time- and space-consuming because the result list is reallocated as it grows.
  2. It can be slightly optimized for time and space by initializing the result list with the combined size of the individual lists (this avoids internal resizing of the result).
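
A rough sketch of the per-task lists plus a presized merge could look like this. The `parse` method, the `chunkSize` and `threads` parameters, and the in-memory `lines` list are assumptions made for the example; a real loader would read ranges of the file instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ChunkedLoadSketch {

    // Hypothetical mapping from a line to a domain object (a String here).
    public static String parse(String line) {
        return line.trim().toUpperCase();
    }

    public static List<String> load(List<String> lines, int chunkSize, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // One task per range of lines; each task fills its own ArrayList.
            List<Callable<List<String>>> tasks = new ArrayList<>();
            for (int start = 0; start < lines.size(); start += chunkSize) {
                final int from = start;
                final int to = Math.min(start + chunkSize, lines.size());
                tasks.add(() -> {
                    List<String> local = new ArrayList<>(to - from);
                    for (String line : lines.subList(from, to)) {
                        local.add(parse(line));
                    }
                    return local;
                });
            }

            // invokeAll blocks until every task is done and returns the
            // futures in the same order as the submitted tasks.
            List<List<String>> parts = new ArrayList<>();
            int total = 0;
            for (Future<List<String>> future : pool.invokeAll(tasks)) {
                List<String> part = future.get();
                parts.add(part);
                total += part.size();
            }

            // Presize the result so addAll never triggers an internal resize.
            List<String> merged = new ArrayList<>(total);
            for (List<String> part : parts) {
                merged.addAll(part);
            }
            return merged;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while loading", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A load task failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Only the final merge touches a shared list, and it runs on a single thread after all tasks are done, so no synchronization is needed on the lists themselves.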
Thiyanesh
0

You can use Guava Cache or Caffeine, because holding a huge file's worth of objects will put more pressure on the GC.

liveM