
I have some code that runs a load test against a web service by spinning up multiple threads and hitting the service with a specified transaction at a given rate. The transaction retrieves a list of values from the service, checks each value to see whether it already exists in a set, and adds it if it does not, or fails the transaction if it does (I'm aware the separate check is unnecessary and the return value of the add could be inspected instead; that's just how the code is written now).

Looking at the code, however, it is not thread safe. The set being checked against and added to is a plain HashSet. The current code also increments a value in a regular HashMap for each transaction, so it looks like this code has been broken from the beginning when it comes to thread safety.

I believe I solved the Map increment issue using the ConcurrentHashMap-based solution described here: Atomically incrementing counters stored in ConcurrentHashMap, but I'm not sure of the best way to handle the duplicate check/modification on the Set in a thread-safe way.
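For reference, the counter update I mean is along these lines (a simplified sketch, not my exact code; transactionCounts and transactionName are just placeholder names):

// per-transaction-type counters updated from many worker threads;
// merge() performs the read-increment-write as a single atomic operation per key
ConcurrentMap<String, Long> transactionCounts = new ConcurrentHashMap<>();
transactionCounts.merge(transactionName, 1L, Long::sum);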

Originally I considered using CopyOnWriteArraySet, but because the expected case is to get no duplicates, writes would occur as frequently as reads, so it doesn't seem ideal. The solution I'm considering now is to use a Set 'view' on ConcurrentHashMap via newKeySet()/keySet(mappedValue), as described here: https://javarevisited.blogspot.com/2017/08/how-to-create-thread-safe-concurrent-hashset-in-java-8.html

If I use this solution and check for duplicates by just adding the value and checking the boolean return value, will that achieve what I want in a thread-safe way? My main concern is that it is important that I DO detect any duplicates. What I don't want is for two threads to try to add at the same time and for both adds to return true because the value was not there when each attempted the add, so that duplicate values received from the service go undetected. For that purpose I thought maybe I should use a List and check for duplicates at the end by converting it to a set and comparing sizes. However, it is still preferable to at least attempt to detect a duplicate during the transaction and fail if one is detected. It is fine to get an occasional false negative and still pass the transaction if we can catch it at the end, but I think checking and failing the transaction when we can is still valuable.
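To make the intent concrete, the duplicate check I have in mind would look roughly like this (a sketch, not the real code; the String value type and the failTransaction call are placeholders):

// thread-safe Set view backed by a ConcurrentHashMap
Set<String> seenValues = ConcurrentHashMap.newKeySet();

// inside each transaction:
for (String value : valuesFromService) {
    // add() returns false if the value was already present,
    // i.e. some thread has already recorded it
    if (!seenValues.add(value)) {
        failTransaction("duplicate value received: " + value);
    }
}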

Any advice is appreciated- thanks!

ryoaska
  • Please show your code for interacting with the ConcurrentHashMap. If you use `compute`, you have exclusive access for the duration of executing the BiFunction. – Andy Turner Jun 13 '18 at 16:32
  • "I considered using `CopyOnWriteArraySet`, but ... it doesn't seem ideal" Have you profiled this? Have you measured that the performance does not meet your requirements? You've already ruled out something which will probably work fine. – Michael Jun 13 '18 at 16:33
  • http://idownvotedbecau.se/nocode/. [*"Premature optimization is the root of all evil"* - Donald Ervin Knuth](http://wiki.c2.com/?PrematureOptimization) – Turing85 Jun 13 '18 at 16:36
  • No code! The checks you are mentioning happen on the client/test side. Can't you keep it simple even if it is less performant? – gagan singh Jun 13 '18 at 16:50

3 Answers


I believe I solved the Map increment issue using the ConcurrentHashMap-based solution described here: Atomically incrementing counters stored in ConcurrentHashMap, but I'm not sure of the best way to handle the duplicate check/modification on the Set in a thread-safe way.

Yes, you can certainly use a ConcurrentHashMap in your solution.

If I use this solution and check for duplicates by just adding the value and checking the boolean return value, will that achieve what I want in a thread-safe way?

Yes. ConcurrentHashMap is a fully reentrant class, so if two threads do a put(...) of the same key at the same instant, one of them will win and get back null (no previous mapping), while the other will replace the value and get back the first thread's value, which you can test for. It is designed specifically for high-performance multi-threaded applications. You can also use putIfAbsent(...), in which case the second thread (and any others) will get back the value already in the map without replacing it. This also works if you are using a keySet wrapper to supply Set mechanics.

With any synchronized class, you need to be careful about race conditions when you make multiple calls to it. For example, something like the following is a terrible pattern because the two separate calls to the concurrent map create a race condition:

// terrible pattern which creates a race condition
if (!concurrentMap.containsKey(key)) {
   concurrentMap.put(key, value);
}

This is the reason why the ConcurrentMap has a number of atomic operations that help with this:

  • V putIfAbsent(K key, V value); -- puts the key into the map only if it is not there already
  • boolean remove(K key, V value); -- removes the key from the map only if it is currently mapped to the given value
  • boolean replace(K key, V oldValue, V newValue); -- replaces the key's value with the new value only if it currently has the old value
  • V replace(K key, V value); -- replaces the value associated with the key only if the key already exists in the map

All of these operations would require multiple, non-atomic calls to implement from outside the map, which would introduce race conditions.
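For example, the check-then-act snippet above collapses into a single atomic call (a sketch using the same hypothetical concurrentMap, key, and value names):

// atomic: at most one thread can install the mapping for a given key
if (concurrentMap.putIfAbsent(key, value) != null) {
    // the key was already present, so another thread got there first
}

The keySet view gives you the same guarantee through Set.add(), which returns false for the thread that loses the race.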

My main concern is that it is important that I DO detect any duplicates. What I don't want is for two threads to try to add at the same time and for both adds to return true...

As mentioned above, this won't happen. One of the two puts will return null, and the other will return a non-null previous value and should be counted as a duplicate.

For that purpose I thought maybe I should use a List and check for duplicates at the end by converting it to a set and comparing sizes.

The list would be unnecessary and very hard to get right.

Gray

I think a ConcurrentHashSet-like set is your best friend:

Set<Value> values = ConcurrentHashMap.newKeySet();

The set is backed by a ConcurrentHashMap, so your code benefits from both the thread safety and the performance of ConcurrentHashMap.

Mạnh Quyết Nguyễn

Just a little advice: if your Transaction object (or whatever you put into the Set) has a proper equals (and hashCode) implementation, you do not need to check for duplicates in the Set yourself.

A Set only ever holds unique values. If you still need to know whether an object is already in the set, use the contains method.

Then there are multiple ways to do what you need.

  • You can use a ConcurrentHashMap instead of a Set: just put your objects in as keys. You then have a keySet() view you can use. The value can be anything (e.g. the same object). You could also work with the values() collection instead.

  • You can use one of the BlockingQueue implementations (e.g. LinkedBlockingQueue) to collect transactions from the different threads first, and then apply whatever logic you want after all threads are done (a rough sketch follows below)

and there are many other ways...
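A rough sketch of the queue-based option (assuming the values are plain strings; the received and value names are illustrative):

// every worker thread just records what it received; no duplicate check here
BlockingQueue<String> received = new LinkedBlockingQueue<>();
received.add(value);

// after the load test finishes, dedupe once on a single thread
Set<String> unique = new HashSet<>(received);
if (unique.size() < received.size()) {
    // the service returned at least one duplicate value
}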

Vadim