
I've read through Java Concurrency in Practice and am left with this question: when I use a ConcurrentHashMap, what data concurrency issues discussed in Part One of the book do I still need to worry about? Here are a couple of examples from one of my programs:

1. Current position of a trader (a shared integer where 'integer' is the mathematical term)

This number represents what a trader object currently owns and defines its state. It must read its position to know what to do (look to start a new position, or manage the current one). Trader methods run on their own thread.

A broker object is in charge of setting the trader's position. It will set the position each time one of the trader's orders is filled. Broker methods run on their own thread.

Both the trader and broker are in the same package. Position is implemented as a package-private static ConcurrentHashMap. The keys are the IDs of the trader objects; the values are Integers.

External to the package is the application. It gets the traders' positions indirectly with a public getter.

Positions will change at most once every few minutes, so the broker won't touch the map very often. However, the trader and application will read it frequently. In addition, several traders often read the map concurrently.

So using a ConcurrentHashMap this way, I don't have to worry about locking and data visibility? The ConcurrentHashMap takes care of everything?

2. The market (bid, ask, last prices)

Pretty much the same situation as position, except now the broker will very frequently update the prices (up to 10 updates a second during busy times; normally a few times a second). The trader and application still do frequent reads. The map keys now are codes indicating which stock or future, and the values are objects which hold the market prices.

It seems to work okay, but after reading JCIP, I realize the program can still be broken if things are not implemented correctly. The book talks about the ConcurrentHashMap but doesn't explicitly tell me what issues from Part I we no longer have to address manually. It appears that I don't have to synchronize anything in this case. Is that correct?

Pete
  • I'd encourage you to reword your question. This assumes that everyone has read the book. This will limit the number of responses you get. – David Weiser Dec 29 '10 at 18:17

3 Answers


So using a ConcurrentHashMap this way, I don't have to worry about locking and data visibility? The ConcurrentHashMap takes care of everything?

This depends on what is in the map. If I read your example correctly, the situation looks like this:

static final ConcurrentMap<Integer,Integer> map = ...

class Trader{

  public int doRead(){
      // return the value; the original snippet declared int but returned nothing
      return map.get(someId);
   }
}
class Broker{
   public void doWrite(){
      map.put(someId,someValue);
   }
}

If that is the case, then yes, all the concurrency is taken care of.

However if the map looks like

static final ConcurrentMap<Integer,Trader> map = ..

    class Broker{
       public void doWrite(){
          map.get(someId).setPosition(somePosition);
       }
    }

This is NOT thread safe. Even though the ConcurrentHashMap locks when you put, it only protects the map itself: any object stored in the map that is accessed concurrently must handle its own synchronization.
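One way to sidestep the problem is to store immutable values and replace them, rather than mutating an object that is already in the map. The following is only a sketch under assumptions: the `Quote` and `Market` classes, their fields, and the `String` instrument codes are hypothetical names chosen for illustration, not something taken from the question.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical immutable value class; the name and fields are illustrative only.
final class Quote {
    private final double bid;
    private final double ask;
    private final double last;

    Quote(double bid, double ask, double last) {
        this.bid = bid;
        this.ask = ask;
        this.last = last;
    }

    double bid()  { return bid; }
    double ask()  { return ask; }
    double last() { return last; }
}

// Hypothetical holder for the shared map.
class Market {
    // Keys are instrument codes, values are immutable snapshots.
    static final ConcurrentMap<String, Quote> PRICES = new ConcurrentHashMap<>();

    // Broker thread: never mutates a stored Quote, always replaces it with a new one.
    static void update(String code, double bid, double ask, double last) {
        PRICES.put(code, new Quote(bid, ask, last));
    }

    // Trader/application threads: a snapshot they already hold cannot change underneath them.
    static Quote read(String code) {
        return PRICES.get(code);
    }
}
```

A reader that calls `Market.read("ES")` gets either the old snapshot or the new one, never a half-updated mix of bid, ask and last.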

John Vint
  • So the problem in the second example is we have the Broker acquiring a lock, then an alien call to the Trader who needs the lock to the same map? – Pete Dec 29 '10 at 20:09
  • The problem is that the map is thread safe, but any mutable objects in the map are not thread safe. – Peter Lawrey Dec 29 '10 at 21:00
  • @Pete, as Peter Lawrey mentioned - mutable objects are not thread safe unless they themselves are synchronized correctly. Once you put into the CHM you lose any new synchronization points (a lock acquisition). All CHM gets (well almost all) will never acquire the put-lock so if a field of an object the map is holding has changed other threads may not see that change. – John Vint Dec 29 '10 at 21:21

Yes, ConcurrentHashMap takes care of visibility and locking as long as:

  • the values held by the map are immutable. That seems to be true in your description, given that your price objects are immutable;
  • you don't have operations on the map that must be atomic but can't be expressed as a single call to the map's API. For example, if you need an operation like 'read a value from the map, perform a calculation, and put the result back into the map' to be atomic, you still need to hold an explicit lock during this operation or, better yet, change the application so that it only uses atomic operations of the map API such as get/put/putIfAbsent.
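To illustrate the second point, here is a rough sketch of a 'read, calculate, write back' update done three ways. The `Positions` class and its method names are hypothetical. Note that in the question only the broker writes, and it sets the position outright, so a plain `put` is already enough there; this only matters once several threads update the same key based on its current value. The `merge` variant assumes Java 8 or later, while the retry loop uses only `putIfAbsent` and `replace`.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical holder for the shared position map (trader id -> position).
class Positions {
    static final ConcurrentMap<Integer, Integer> POSITIONS = new ConcurrentHashMap<>();

    // Racy: another thread can write between get and put, and that update is lost.
    static void addFillRacy(int traderId, int filledQty) {
        Integer current = POSITIONS.get(traderId);
        POSITIONS.put(traderId, (current == null ? 0 : current) + filledQty);
    }

    // Atomic: merge performs the read-modify-write as a single map operation (Java 8+).
    static void addFill(int traderId, int filledQty) {
        POSITIONS.merge(traderId, filledQty, Integer::sum);
    }

    // Atomic on older JDKs: retry until no other writer interfered.
    static void addFillCas(int traderId, int filledQty) {
        for (;;) {
            Integer current = POSITIONS.get(traderId);
            if (current == null) {
                // No entry yet: succeed only if nobody else inserted one first.
                if (POSITIONS.putIfAbsent(traderId, filledQty) == null) return;
            } else if (POSITIONS.replace(traderId, current, current + filledQty)) {
                // replace succeeds only if the value is still the one we read.
                return;
            }
        }
    }
}
```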
Victor Sorokin

Unless you are talking about over 100,000 updates/reads per second (a very rough guide), I wouldn't consider using multiple threads. The reason is that thread-safe components take many times longer than components that are not thread safe. So if a component takes 5x longer to be thread safe, you need to be using more than 5 threads concurrently just to break even, never mind go faster.

Multiple threads are much more useful when you have relatively expensive operations to perform. Updating a position or price is much more efficient as one thread.

Peter Lawrey
  • In this case going with separate threads segmented the code in a logical way and made it easier to maintain and develop. There is only a little bit of shared, mutable data so I think the trade-off is well worth it. Speed is not much of an issue since I am not taking part in the madness known as "HFT" – Pete Dec 29 '10 at 21:17
  • If speed is not much of an issue, I would use one high level lock for the collection and everything in it. (and make sure the access is as quick as possible, no calls to IO etc) This is the simplest option. – Peter Lawrey Dec 29 '10 at 21:24
  • Correct, the cost of the thread context switch is significant, so you need to make sure that whatever work is being done is sufficiently costly to be worth it. Here's an interesting link describing the actual cost: http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html – Jed Wesley-Smith Jan 03 '11 at 08:31
  • May I know why the number 100,000? I'm taking the statement as a correct estimation because you're 496k :) but shouldn't that number depend on the expensiveness of the task? And by the way, doesn't creating a thread pool alleviate this expensiveness? – aderchox Apr 26 '21 at 10:04
  • @aderchox I agree, the 100K is very ballpark and depends on a number of factors. It could be 10K or 1 million for your particular case. I assume you are using a Thread pool. – Peter Lawrey Apr 27 '21 at 11:04
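
The "one high level lock for the collection and everything in it" option from the comments above might look roughly like the sketch below. The `LockedPositions` class is a hypothetical name, and returning 0 for an unknown trader is an assumption made only to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder: a plain HashMap guarded by a single lock.
class LockedPositions {
    private static final Object LOCK = new Object();
    private static final Map<Integer, Integer> POSITIONS = new HashMap<>();

    // Broker thread: every write goes through the same lock.
    static void set(int traderId, int position) {
        synchronized (LOCK) {
            POSITIONS.put(traderId, position);
        }
    }

    // Trader/application threads: reads take the same lock, which also gives visibility.
    static int get(int traderId) {
        synchronized (LOCK) {
            Integer p = POSITIONS.get(traderId);
            return p == null ? 0 : p; // assumption: no entry means a flat position
        }
    }
}
```

Because every access holds the same lock, compound operations can be added later inside one synchronized block, as long as the critical sections stay short and do no I/O.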