
I am using the following programming idiom. I keep a synchronized HashMap with an association of names to objects. To look up the object for a name I use the following code:

MyObject getObject(String name) {
   synchronized (map) {
      MyObject obj = map.get(name);
      if (obj == null) {
         obj = new MyObject();
         map.put(name, obj);
      }
      return obj;
   }
}

When I then want to work exclusively on such an object, I use a synchronized block on that object:

synchronized(obj) {
    /* do something exclusively on obj (work type I) */
}

This has been working fine until recently. The new requirement is that there are type I and type II exclusive works: type I keeps the object, and type II should remove the object after it has completed the work. If I do something along the following lines:

synchronized(obj) {
    /* do something exclusively on obj (work type II) */
}
synchronized(map) { /* not good! */
   map.remove(name);
}

I might grant some object type I work although the object has already been removed from the map. So basically the synchronized(obj) for type I work should be replaced by some new semaphore which rejoins the object to the map in case type II work was granted before. Conversely, the object should only leave the map when no synchronized blocks are pending on it.
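
To make the race concrete, here is a hypothetical interleaving of two threads A and B on the same name:

Thread A: obj = getObject("x");                  // "x" -> obj is in the map
Thread B: obj = getObject("x");                  // B gets the same obj
Thread B: synchronized(obj) { /* type II */ }    // work that should retire obj
Thread B: synchronized(map) { map.remove("x"); } // obj has left the map
Thread A: synchronized(obj) { /* type I */ }     // type I runs on a retired object!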

Best would be if the objects are not seen at all; I would go with an API based on the names only. The objects are only used to maintain some state for the names. But the HashMap should be freed of the name after type II work has completed, and during type I or type II work the HashMap should not be locked. A sketch of the API I have in mind follows.
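
For illustration only (the Work type here is a hypothetical placeholder for the actual work to perform):

void doTypeI(String name, Work work);   // the state for name is kept afterwards
void doTypeII(String name, Work work);  // the state for name is freed afterwards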

Any ideas how to do that? Is this a known pattern?

Bye

  • I'm not sure I follow - could you describe exactly your requirement? First you say that type2 work should remove the object from the map when it's done. Then you go into describing some sort of queueing process for tasks. What exactly should the sequence of events be with, say, two threads. – Boris the Spider Mar 03 '14 at 00:32
  • If each `Thread` only uses one name and carries out the sequence of work then can't you just store the data thread locally? I still don't quite understand the question. Where is the sequence stored? How is it allocated to the named objects? – Boris the Spider Mar 03 '14 at 00:41
  • Okay - I think I get it. Use a `ConcurrentHashMap` first of all - don't lock the whole `Map`. Second wrap the keys in another `Object` and lock on that in the work process. This way you can lock the _key_ and replace the value. – Boris the Spider Mar 03 '14 at 00:44
  • @CookieMonster - One clarification: do you store the data / the work in the Map? In other words, will the same MyObject instance perform multiple Type1 works, with the actual data passed to MyObject each time? Or will you have runnable Type1/Type2/Type3 workers which consume the MyObject and perform the operation? Put another way: why do you need to lock on MyObject? Are you changing properties based on work outputs? – Mani Mar 18 '14 at 19:18
  • On one object, only type 1 work or type 2 work exclusively is allowed to be done at any time, even when multiple threads are running. That is also the usual meaning of synchronized(obj). On the other hand, multiple threads are allowed to work on different objects in parallel. –  Mar 26 '14 at 13:48

6 Answers


The requirement seems to be this:

  • There is a Map<String, Object> that is a cache.
  • There are a number of worker threads in a pool that access the cache
  • Some types of work require the object in the cache to be invalidated when they are done

First you will need a ConcurrentHashMap<String, Lock> keys. This Map will store a relationship between the String keys and the Lock objects that we will use to lock the keys. This allows us to replace the key -> value mappings without locking the entire data Map.

Next you will need a ConcurrentHashMap<String, Object> data. This Map will store the actual mappings.

The reason to use a ConcurrentHashMap rather than a plain one is that it is thread safe, which means that manual synchronization is not required. The implementation actually divides the Map into segments and only locks the required segment to carry out operations - this makes it more efficient.
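
For instance, a check-then-act on a plain HashMap needs an external lock, whereas ConcurrentHashMap offers the same operation atomically (a minimal illustration; plainMap and concurrentMap are placeholder names):

// plain HashMap: the check and the put must share one synchronized block
synchronized (plainMap) {
    if (!plainMap.containsKey(key)) {
        plainMap.put(key, value);
    }
}

// ConcurrentHashMap: a single atomic call, no external lock needed
concurrentMap.putIfAbsent(key, value);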

Now, the logic will be:

  1. putIfAbsent a new ReentrantLock into keys. This will, in a thread-safe manner, check whether a lock is already present for a key: if not, the new one is added and putIfAbsent returns null; otherwise the existing one is returned and should be used instead. This means that there will only ever be one lock per key.
  2. Acquire a lock. This means that you gain exclusive access to a mapping.
  3. Do work. In the case of TypeII remove the mapping from data after finishing.
  4. Unlock the lock.

The code would look something like this:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import lombok.RequiredArgsConstructor;

private final ConcurrentHashMap<String, Object> data = new ConcurrentHashMap<>();
private final ConcurrentHashMap<String, Lock> keys = new ConcurrentHashMap<>();
private final ExecutorService executorService = Executors.newCachedThreadPool(); //or configure a pool to taste

@RequiredArgsConstructor
private class TypeI implements Runnable {

    private final String key;
    private final Work work;

    @Override
    public void run() {
        final Lock newLock = new ReentrantLock();
        final Lock existing = keys.putIfAbsent(key, newLock); // returns null if we inserted
        final Lock lock = existing != null ? existing : newLock;
        lock.lock();
        try {
            final Object value = data.get(key);
            work.doWork(value);
        } finally {
            lock.unlock();
        }
    }
}

@RequiredArgsConstructor
private class TypeII implements Runnable {

    private final String key;
    private final Work work;

    @Override
    public void run() {
        final Lock newLock = new ReentrantLock();
        final Lock existing = keys.putIfAbsent(key, newLock); // returns null if we inserted
        final Lock lock = existing != null ? existing : newLock;
        lock.lock();
        try {
            final Object value = data.get(key);
            work.doWork(value);
            data.remove(key);
        } finally {
            lock.unlock();
        }
    }
}

public static interface Work {

    void doWork(Object value);
}

public void doTypeIWork(final String key, final Work work) {
    executorService.submit(new TypeI(key, work));
}

public void doTypeIIWork(final String key, final Work work) {
    executorService.submit(new TypeII(key, work));
}

I have used Lombok annotations to reduce the amount of clutter.

The idea is to minimise, or almost eliminate, the amount of common resource locking while still allowing a Thread to gain, if needed, exclusive access to a particular mapping.

To clean the keys Map you would need to guarantee that no work is currently ongoing and that no Threads would try to acquire any locks during the cleaning period. You could do this by attempting to acquire the relevant lock and then removing the mapping from the keys map - this would ensure no other thread was using the lock at the time.
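
A minimal sketch of that cleanup, assuming the keys map above (note the residual race: a thread that has already read a lock reference out of keys may still acquire it after the removal):

void cleanKey(final String key) {
    final Lock lock = keys.get(key);
    if (lock != null && lock.tryLock()) { // succeeds only if no thread holds it
        try {
            keys.remove(key, lock); // remove only if still mapped to this lock
        } finally {
            lock.unlock();
        }
    }
}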

You could run a scheduled task that clears, say, 20 keys from the map every X minutes. If you implemented it as an LRU cache then it should be fairly clean. Google Guava provides an implementation that you could use.
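
For example, a size-bounded Guava cache could stand in for the keys map (a sketch assuming Guava is on the classpath; the maximum size of 1000 is arbitrary):

private final Cache<String, Lock> keyCache = CacheBuilder.newBuilder()
        .maximumSize(1000) // evicts entries in roughly least-recently-used order
        .build();
// keyCache.asMap() exposes a ConcurrentMap<String, Lock> view

with imports com.google.common.cache.Cache and com.google.common.cache.CacheBuilder.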

Boris the Spider
  • I have implemented something similar recently. The challenge here is to deal with very long-running type I or type II works. How can a cleanup deal with that, i.e. proceed even in the presence of very long-running type I or type II works? P.S.: This claim that there is long-running work should not be seen as in conflict with my high-frequency requirement. Besides type I and type II work, there are some read-only tasks, let's call them type III works, which are not critical and should be available anytime. –  Mar 03 '14 at 11:34
  • @CookieMonster you will have to wait for them to finish - if you remove the lock key that a currently running process has locked then you will allow another exclusive process with the same key to start. Not sure there is a way around that. – Boris the Spider Mar 03 '14 at 13:46
  • BTW: JDK 8 has a new method in Map, computeIfAbsent. Should solve the performance issue with putIfAbsent. But it will break code running on JVMs before JDK 8. See also: http://stackoverflow.com/questions/19278443/how-do-i-use-the-new-computeifabsent-function –  Mar 03 '14 at 18:51
  • @CookieMonster yes, indeed. The introduction of lambdas in Java 8 maybe solves such issues. – Boris the Spider Mar 03 '14 at 23:14
  • Instead of a regular Lock, use a ReadWriteLock. Type I should use lock.readLock().lock(), and type II should use lock.writeLock().lock(). – ethanfar Mar 17 '14 at 08:46
  • @eitanfar if you look at the edit history you will see that I proposed this as one of the first solutions. You will also see that the OP requires **exclusive access** to the cached resource so this isn't an appropriate solution. – Boris the Spider Mar 17 '14 at 08:49
  • Does this provide a guarantee of exclusive access to the object? I mean, the map is secured and no other thread can retrieve the object from the map, but once it is retrieved there is no guarantee that the instance can't be modified by another thread (since the object itself is not locked), isn't that so? – Mani Mar 18 '14 at 18:25
  • @Mani I'm not sure you have understood the code. The lock is acquired from the `Map keys`. – Boris the Spider Mar 18 '14 at 21:25
  • @BoristheSpider, I believe I understood your code. What I was trying to say is: we do monitor the Lock instance for the block, but outside of this block the Object can be accessed or modified by other threads, isn't that so? I think he wants an exclusive lock on the "value" object. I thought he was also using the object outside of this block; that's why he wants an exclusive lock on the object (the value in the data Map). – Mani Mar 19 '14 at 14:05
  • @Mani any synchronization strategy depends on all users of the object in question adhering to it. I would suppose that the `Map` would remain private and users would submit tasks to it. The class that manages the `Map` would guarantee synchronized access. – Boris the Spider Mar 19 '14 at 14:38
  • @BoristheSpider - Agreed, and thanks for taking the time to clarify. I raised this just because he used the object as a monitor in his example. – Mani Mar 19 '14 at 14:44

1) First, use a hash table to store your data objects: objHash.

2) You need an extra lock to assure the atomic execution of the type 1 and type 2 operations on objHash. Type 2 operations are writes and type 1 operations are reads, so you can use a read-write lock, and store these locks in a hash table: lockHash.

3) To assure the atomicity of the type 1 and type 2 operations on the data object itself, enclose the type 1/2 operation in a synchronized statement that locks that data object.

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ConDeleteHash {
    ConcurrentHashMap<String, Object> objHash = new ConcurrentHashMap<String, Object>();
    ConcurrentHashMap<String, ReentrantReadWriteLock> lockHash = new ConcurrentHashMap<String, ReentrantReadWriteLock>();
    void Type1Op(String name) {
        ReadWriteLock rwl = lockHash.get(name);
        if (rwl == null) return;
        Lock lock = rwl.readLock();
        lock.lock();
        try {
            Object obj = objHash.get(name);
            if (obj == null) return;
            synchronized (obj) {
                System.out.println("TYPE1 to: " + obj.toString());
            }
        } finally {
            lock.unlock(); // always released, even on the early return above
        }
    }
    void Type2Op(String name) {
        ReadWriteLock rwl = lockHash.get(name);
        if (rwl == null) return;
        Lock lock = rwl.writeLock();
        lock.lock(); // the write lock must be acquired before it can be released
        try {
            Object obj = objHash.get(name);
            if (obj == null) return;
            synchronized (obj) {
                System.out.println("TYPE2 to: " + obj.toString());
            }
            lockHash.remove(name);
            objHash.remove(name);
        } finally {
            lock.unlock();
        }
    }
    void add(String name, Object obj) {
        if (lockHash.get(name) != null) return;
        objHash.put(name, obj);
        lockHash.put(name, new ReentrantReadWriteLock());
    }
}
Houcheng
  • Where are you seeing an `XXXUnlock()` method on [`ReadWriteLock`](http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/locks/ReadWriteLock.html)? This code is wrong and won't compile. Moreover, as the OP requires `exclusive` access to the cached objects for each operation the idea of the code is also wrong. – Boris the Spider Mar 19 '14 at 13:54
  • Can "if (obj == null) return" be replaced by something else? Type 1 work should always get an object, possibly a freshly initialized one when type 2 work happened before or when we are at the beginning. –  Mar 22 '14 at 09:46
  • What's the point of using a `ReadWriteLock` if you use a `synchronized` block anyway? This is just getting silly... – Boris the Spider Mar 24 '14 at 14:42
  • The ReadWriteLock is used to guarantee the atomic removal from objHash. You are right, the read-write lock can also guarantee the atomic operation on obj, so the synchronized enclosure is not necessary. – Houcheng Mar 25 '14 at 16:50

Delay removing from map

...
class MyObject {
    boolean active = true;
    ...
}


synchronized(obj) {
    if(obj.active){
        /* do something exclusively on obj */
        obj.active = false; //or not
    }
}


MyObject getObject(String name) {
   synchronized (map) {
      MyObject obj = map.get(name);
      if (obj == null) {
         obj = new MyObject();
         map.put(name, obj);
      } else {
         synchronized (obj) {
            if (!obj.active) {
               //any remove action here
               obj = new MyObject();
               map.put(name, obj); // no previous obj in map
            }
         }
      }
      return obj;
   }
}
Sergey Fedorov
  • The delete should be done when type II work is executed, so that when the thread returns from type II work, the object is already removed. Also, the delete can take some time and should not block the map. –  Mar 26 '14 at 21:05

How about this: a slightly modified version of Boris the Spider's answer.

Main class with a ConcurrentHashMap to hold the workers:

public class Concurrent {
        // Hash map to hold workers
        final ConcurrentHashMap<String, Work> jobs = new ConcurrentHashMap<>();

The Work interface:

    interface Work{
         void doWork(Object value);
    }

Base class for blocking works, meaning only one work item can be executed by this instance at a time:

abstract class BaseWork implements Work {

    String name;

    Lock lock = new ReentrantLock();

    BaseWork(String name) {
        this.name = name;
    }

    @Override
    public void doWork(Object value) {
        lock.lock();   // lock the following block
        try {
            if (jobs.get(name) != null) {    // check in case waiting threads want to work on an instance already removed by a completed Type2 work
                performTask(value);
                System.out.println("Job Completed");
            } else {
                // a new job has to be triggered; this branch is only possible
                // after Type2 work, so create a Type2Work. Note putIfAbsent
                // returns null when it inserts, so fall back to the new work.
                Work newWork = new Type2Work(name);
                Work current = jobs.putIfAbsent(name, newWork);
                (current != null ? current : newWork).doWork(value);
                System.out.println("Removed.. Job terminated");
            }
        } finally {
            lock.unlock(); // unlock this block, so other threads can start working
        }
    }
    abstract void performTask(Object value);    // actual job
}

Here, name is the same as the key in the ConcurrentHashMap. As soon as doWork is called, it locks the block where the actual work gets executed.

Type 1 and Type 2 implementations:

class Type1Work extends BaseWork {

    Type1Work(String name) {
        super(name);
    }

    @Override
    void performTask(Object value) {
        // Do type 1 work
    }
}

class Type2Work extends BaseWork {

    Type2Work(String name) {
        super(name);
    }

    @Override
    void performTask(Object value) {
        // Do type 2 work, then remove the job from the map.
        jobs.remove(name);
    }
}

Non-blocking work - the type III kind (doWork can perform work without sharing any information between threads):

class NonLockingWork implements Work {

    @Override
    public void doWork(Object value) {
        // Do thread-safe, non-blocking work (Type III)
    }
}

Finally, a block to load the works into the Map:

String key = "type1-name1";
Work newWork = new Type1Work(key);
Work work = jobs.putIfAbsent(key, newWork); // returns null if we inserted
(work != null ? work : newWork).doWork(new Object());



}
Mani
  • putIfAbsent is an atomic operation, so it is completely safe. If doWork is called by any thread, then the instance was retrieved from the map while it was live. If you don't want to terminate the waiting threads and want to create a new one, you can create it there. – Mani Mar 26 '14 at 21:17
  • With your change in the code, typeIWork will do typeIIWork if the work was removed from the jobs map. But typeIWork should always do type I work, even if the work was removed beforehand. –  Mar 26 '14 at 22:52
  • But I thought the Type1Work won't be removed. Even in that case, you can check the instance type and create an instance of that type in the same position. But it would be great if you gave more info, like why you want to remove Type2 and what kind of work you are performing in Type1 and Type2. – Mani Mar 27 '14 at 13:14
  • The named objects are not associated with a fixed type of work. The client decides which work to do with a name and the object that is behind the name. A client can decide at any time to do either type I work or type II work on some name. And it is allowed to have multiple clients via multi-threading. And different clients might choose the same name. –  Mar 27 '14 at 17:25

There is a problem with the logic of this question: why are you deleting an object from the map if a user is allowed to ask for type 1 operations after type 2 operations? If a user is allowed to submit a type 1 after a type 2, there will always be cases where a type 1 is requested after an object is deleted. In which case, why are you deleting it?

Is the issue that a client can only submit type 1 operations before type 2 operations, but you cannot guarantee that one will be carried out before the other by the executor service? In that case use a priority executor service and submit type 1 with a higher priority than type 2, as that will ensure a type 1 is always started before a type 2 if both are pending. Since the type 2 cannot start until the type 1 is finished, this means that a delete is always carried out after the type 1, provided the type 1 is submitted before the type 2. A sketch of such an executor is below.
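
A minimal sketch of a priority executor, assuming tasks carry an explicit priority (PriorityTask is an illustrative name, not an existing class; note that tasks must be submitted via execute(), since submit() wraps them in a FutureTask that is not Comparable):

class PriorityTask implements Runnable, Comparable<PriorityTask> {
    final int priority; // lower number runs first
    final Runnable delegate;
    PriorityTask(int priority, Runnable delegate) {
        this.priority = priority;
        this.delegate = delegate;
    }
    @Override public void run() { delegate.run(); }
    @Override public int compareTo(PriorityTask other) {
        return Integer.compare(this.priority, other.priority);
    }
}

ExecutorService priorityExecutor = new ThreadPoolExecutor(
        1, 1, 0L, TimeUnit.MILLISECONDS,
        new PriorityBlockingQueue<Runnable>()); // pending tasks ordered by priority
// e.g. priorityExecutor.execute(new PriorityTask(0, someTypeOneWork));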

This feels like a case where bad program design elsewhere has resulted in an insoluble dilemma. If you can explain how these odd specifications came about, then we might be able to craft a more durable solution.

Also, in this type of concurrency, don't delete objects from the map; replace them with a singleton object which is a subclass of your object, and have the doWork method check for this singleton. This is more reliable than null checking, as nulls can be created for lots of reasons, but your singleton object is passed to doWork for one particular reason, which means that error tracing is easier later in development.
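
A sketch of that idea (RemovedMyObject and REMOVED are illustrative names, assuming a MyObject base class as in the question):

class RemovedMyObject extends MyObject { }
static final MyObject REMOVED = new RemovedMyObject(); // tombstone singleton

void doWork(String name, MyObject obj) {
    if (obj == REMOVED) {
        // the mapping was deliberately retired; clearer to trace than a null
        return;
    }
    // ... normal type I / type II work on obj ...
}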

phil_20686
  • A client might or might not issue type I work for a certain name again after a client, not necessarily the same client, has issued type II work for the same name. I want to keep memory low. Otherwise the wasteful solution is trivial, and I have implemented it already. Basically I am currently going with a modified version of the two-maps solution from Boris the Spider. But in his solution only one of the maps is shrinking, the other isn't. (There is a small side issue: Java maps usually don't resize to smaller sizes, but I have a custom map which does.) –  Mar 27 '14 at 12:24
  • Some prioritization, in case two different clients issue two types of work at the same time for a certain name while another client is still working on this name, is not necessary. This is not part of the requirement. Any fairness or prioritization that comes, for example, with the synchronized primitive of Java is already enough. Also there is no requirement to associate null with a name; there is the invariant that a hash map get of null can be interpreted as a hash map miss. But a solution might use something else than hash maps; this is only how everything started. –  Mar 27 '14 at 12:32
  • So is this map to behave like a cache? And the objects are available from some kind of database if they go missing? In that case you can implement memory efficient solutions with WeakReferences, which are references which do not prevent garbage collection, but the GC only collects the objects if heap memory is low. http://docs.oracle.com/javase/7/docs/api/java/util/WeakHashMap.html – phil_20686 Mar 27 '14 at 14:20
  • I am still having trouble understanding the requirements. A client is issuing work which (presumably)* alters the state of the object, and another client is allowed to do work which also alters the state of the object, and neither knows which job gets carried out first? This does not sound like a correct specification. How can a user know if the object the work was carried out on was unchanged? *If the object's state is not to be changed, e.g. using the internal state to generate another object, make your objects immutable. – phil_20686 Mar 27 '14 at 14:24
  • 1) It's not really a cache, since the objects that sit behind the names do not get reclaimed when memory is low. But WeakHashMap could nevertheless go into the picture; Boris the Spider and I didn't yet figure out how. 2) What gets changed during work is the state of the object: inside the synchronized statement as posted in the question, where it is highlighted /* do something exclusively on obj */, the "do something" means that fields of the object are read and written. Also methods are called. So having immutable objects here wouldn't serve the requirement. –  Mar 27 '14 at 17:22

You could use an AtomicInteger to keep track of the number of tasks in progress on the object. Then, for type II tasks, only remove the object if there are no tasks in progress:

import java.util.concurrent.atomic.AtomicInteger;

class MyObject {
   private AtomicInteger worksInProgress = new AtomicInteger(0);
   public int incWIP() {
      return worksInProgress.incrementAndGet();
   }
   public int decWIP() {
      return worksInProgress.decrementAndGet();
   }
   public int getWIP() {
      return worksInProgress.get();
   }
   ...
}

MyObject getObject(String name) {
   synchronized(map) {
      MyObject obj = map.get(name);
      if (obj == null) {
         obj = new MyObject();
         map.put(name, obj);
      }
      obj.incWIP(); // assume you're doing work on this starting now
      return obj;
   }
}

Work type I would look like:

MyObject obj = getObject(name);
synchronized(obj) {
   obj.workI();
}
obj.decWIP(); // finished doing work type I

And type II would look like:

MyObject obj = getObject(name);
synchronized(obj) {
   obj.workII();
}
if (obj.decWIP() == 0) { // finished with this work and all others
   synchronized(map) {
      // double-check the value because we checked previously without the map lock
      if (obj.getWIP() == 0) {
         map.remove(name);
      }
   }
}
Peter G
  • Cool, this could be really something. I guess there is also an obj.decWIP() before the first if (obj.decWIP()==0). –  Mar 27 '14 at 19:16