2

Assume that we have a method doSomething(String input) and we want to run it synchronized by different inputs.

It means that running doSomething(A) should block any consecutive calls of doSomething(A) until the first one is completed BUT should not block doSomething(B) or doSomething(C).

So i've created a wrapper method to achieve this goal. It creates objects based on input value and places locks on them and keeps a reference to them in a list.

private static final ArrayList<String> runningTasks  = new ArrayList<>();


public void doSomethingSyncedByInput(String input) {

    // Create a lock or load an already created lock from the list.
    // (Yeah, it's a race condition but forget about it. It's just an example.)
    String lock = new String(input);
    if(runningTasks.contains(input)){
        // get currently available lock object
        lock = runningTasks.get(runningTasks.indexOf(input));
    }else {
        // add a reference on tasks list
        runningTasks.add(lock);
    }

    synchronized (lock) {
        doSomething(input);
    }
}

It actually works; but it's not a totally thread-safe solution as ArrayList is not thread-safe. The contents of ArrayList is not volatile and according to documentation, adding and removing items on the list is not reflected on other threads instantly.

Note that this implementation is not synchronized. If multiple threads access an ArrayList instance concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally.

A well known thread safe variant of ArrayList is CopyOnWriteArrayList (which makes a copy of the elements and re-sets the inner volatile field holding elements to be sure to have latest version of the list on all other threads instantly). As the name yields, it COPIES each one of the list items on adding a new item to the list and it means the references to the actual objects that got the lock on, would be lost and the code will be broken.

So I need a datatype which holds a list of objects and doesn't copy or change objects in favor of concurrency support. I could use String.intern() functionality but it brings a lot of potential problems especially on mobile platforms.

Any ideas?

Or, do you know any bulletproof implementations available to use?

I'm on Android platform by the way.

UPDATE:

I've reached a solution by myself. Check out my own answer. Comments are welcomed.

zxcmehran
  • 1,435
  • 14
  • 24
  • `String.valueOf(String)` does nothing, it will just return the original object. – markspace Apr 29 '18 at 16:04
  • Now you have a race condition where you check the `runningTasks` with `indexOf` and then later add the value if it is not found. – markspace Apr 29 '18 at 16:27
  • See [name based lock](https://stackoverflow.com/questions/5639870/simple-java-name-based-locks/28723518#28723518) and [aquire lock by a key](https://stackoverflow.com/questions/11124539/how-to-acquire-a-lock-by-a-key/11125602#11125602) – xingbin Apr 29 '18 at 16:32
  • seeing as you're always locking on something, 2 calls on 2 threads trying to acquire the SAME lock means they will block anyway as your code has a `synchronized` block that is used regardless (from what I understand you want initial synchronization on a lock object, then you don't care for subsequent calls to that object)? Also as you already know the arraylist access isn't ttread safe – Mark Apr 29 '18 at 17:41
  • @markspace Yeah but it's not the main concern and implementing a safe workaround for this would not be a headache. So i'm focusing on the problem itself. – zxcmehran Apr 29 '18 at 17:58
  • @MarkKeen I want it not to run `doSomething(input)` on two inputs with the same string value. Specifying other inputs should not block. – zxcmehran Apr 29 '18 at 18:01
  • Name *ONE* problem that String.intern causes. – G. Blake Meike Apr 29 '18 at 18:49
  • @G.BlakeMeike We've discussed a bit under your answer, but for more info you can check https://stackoverflow.com/questions/133988/synchronizing-on-string-objects-in-java – zxcmehran Apr 29 '18 at 18:57
  • The discussion in that post says only that interning *might* be a problem. You realize, I suppose OP's goal *requires* a solution that is essentially equivalent to interning. – G. Blake Meike Apr 30 '18 at 00:19

3 Answers3

0

You don't need a data type. You need a semaphore dedicated to the locking code. Take a semaphore before you check contains. Figure out if you're in the list. Add yourself to the list if needed. Release it after. You now have a synchronous way of obtaining any lock.

(This is also basically what any synchronized type will do- except by doing the semaphore yourself you can synchronize multiple operations at a time, such as checking contains then adding to the list. Its a frequent bug to use a synchronized data type, expect it will fix your problems, only to find that you needed to synchronize at a higher level around multiple functions).

Of course your code has other weaknesses. You're never removing old keys from the list as they finish, so you could have unbounded size there. Also contains on a list is an N operation, a hashmap is better if you care more about speed than memory.

Gabe Sechan
  • 90,003
  • 9
  • 87
  • 127
  • I dont get that how a semaphore is going to help. It might help in order to keep the atomicity of transactions done on the list but the main problem of `ArrayList` is that the changes to the list is not reflected on the instance of this object on other threads, and `ArrayList.contains()` might return false while the item is added already on another thread. – zxcmehran Apr 29 '18 at 16:49
  • @zxcmehran You use the semaphore around any access to the array list. That way only one thread can access the array list at a time. Since all threads need to take the semaphore, then no other thread can add anything to the list. This is what the synchronized keyword does- it takes a semaphore when you enter the function/block, and releases it when you exit. Except by using it yourself, you aren't limited to a single function call per semaphore access allowing threads to get in in between calls. – Gabe Sechan Apr 29 '18 at 16:52
  • @zxcmehran "Changes to this list are not reflected on the instance of this object on other threads"- no such thing. In your code there's a single array list. Its shared by all threads. It can be edited by multiple threads, but those changes will appear immediately, minus any caching of previous results in registers.. which isn't an issue with your code (its usually an issue when you check variables repeatedly looking for changes, and is generally fixed via using volatile, but this isn't one of those issues). – Gabe Sechan Apr 29 '18 at 16:56
  • "its usually an issue when you check variables repeatedly looking for changes" - The method might be called in any conditions, so it should be considered important. And no, changes to `ArrayList` is not reflected immediately. I've updated the question. – zxcmehran Apr 29 '18 at 17:35
  • @zxcmehran no, it isn't as you don't cache around function boundaries. You really don't understand what you're talking about – Gabe Sechan Apr 29 '18 at 17:37
  • Okay, so I'll be thankful if you have any links or examples about the special caching case you're talking about. – zxcmehran Apr 29 '18 at 18:35
  • Umm. @GabeSechan, I think you don't understand the Java memory model very well. You are quite mistaken in your assertion that changes to an ArrayList will appear immediately on all threads. – G. Blake Meike Apr 30 '18 at 00:12
  • @G.BlakeMeike it's a global variable. The changes are there instantly. You don't need a special data structure to see them. There are cases where you need volatile, but the above use case isn't one – Gabe Sechan Apr 30 '18 at 00:22
  • You are completely mistaken about that. The memory model only promises program order, within a thread. – G. Blake Meike Apr 30 '18 at 01:54
  • @G.BlakeMeike You write to a global variable via calling a function (for instance, add). That's a shared reference between each thread. They don't all have a deep copy of the object. That means your next call to get that variable will return the new value. The only way it would not is if it was cached in a register, but that doesn't happen across function boundaries (it could happen if you were accessing a literal directly and it wasn't volatile, but that isn't the case here). You will immediately see those newly added items in the next call to contains. – Gabe Sechan Apr 30 '18 at 01:58
  • @G.BlakeMeike And for fun, just tested it on mobile and desktop. You're wrong. Now if you were *assigning the arraylist global* you'd be right. It would need to be volatile for that to work, as the pointer can be cached rather than requested from memory each time. But we aren't. – Gabe Sechan Apr 30 '18 at 02:00
  • 1
    You simply don't understand the Java Memory Model. Running the output of one compiler on a couple machines, once, is not proof of anything. Don't take it from me. Brian Goetz, Android Concurrency in Practice: "If multiple threads access the same mutable state variable without appropriate synchronization, your program is broken." I'm not given to debating on SO, but I'm pretty tired of folks getting this simple thing, wrong. – G. Blake Meike Apr 30 '18 at 02:45
  • @G.BlakeMeike ran it 10k times on each. It's ok for you to admit you're wrong. Or just leave – Gabe Sechan Apr 30 '18 at 02:46
  • "You really don't understand what you're talking about" – G. Blake Meike Apr 30 '18 at 03:46
  • @G.BlakeMeike I get that you don't. But that's ok, you can learn. – Gabe Sechan Apr 30 '18 at 03:47
  • @GabeSechan I recommend you to learn more about `CopyOnWriteArrayList` and it's fundamental differences with `ArrayList`. Check out their source codes and you'll see that `ArrayList` is set up defining elements holder like `transient Object[] elementData;` and `CopyOnWriteArrayList` defines the holder like `private transient volatile Object[] elements;` It uses `volatile` keyword at raw element storage level. It means `ArrayList` items are not stored in a `volatile` manner. THAT'S the big difference which makes `ArrayList` not suitable for a multithreaded environment. – zxcmehran Apr 30 '18 at 09:41
  • @zxcmehran I know all about it. It makes a copy of the array on every write. There are cases where that's necessary, but its overkill for yours. All you needed was synchronization, as the docs itself says that class is for when synchronization isn't possible. It allows you to traverse without synchronization. Since you can synchronize its pointless, although it won't hurt you if the number of times this function is called with different strings is relatively low. – Gabe Sechan Apr 30 '18 at 12:08
0

Unless I misunderstand your question, this is a trivial thing to do:

<T, R> R doSomethingSynchronizedByInput(T input, Function<T, R> fn) {
    synchronized(input) { return fn.apply(input); }
}

Amended to note lack of specificity in question and to add two solutions:

To do the same thing, on the class/type of the input, you need only slightly modify the code above:

<T, R> R doSomethingSynchronizedByInput(T input, Function<T, R> fn) {
    synchronized(input.getClass()) { return fn.apply(input); }
}

Finally, doing something similar for values that are .equals (this is, apparently, what you mean by "value") is slightly more complex. Something along these lines should work:

// Note: I have not tested this: it is just a sketch.
// It requires that type T have an "equals" method that divides
// it into "values"
// Don't try to use a Comparator, because the HashMap doesn't.
public class DoSomethingSynchedByInput<T, R> {
    public interface Listener<V> { void accept(V val); }

    private final Map<T, LinkedList<T>> waiting = new HashMap<>();

    void doSomething(final T input, Function<T, R> fn, Listener<R> listener)
        throws InterruptedException {
        synchronized (waiting) {
            LinkedList<T> waitList = waiting.get(input);
            if (waitList == null) { waitList = new LinkedList<>(); }
            waitList.addLast(input);
            waiting.put(input, waitList);
            while (true) {
                if (waitList.peekFirst() == input) { break; }
                waiting.wait();
            }
        }
        try { listener.accept(fn.apply(input)); }
        finally {
            synchronized (waiting) {
                LinkedList<T> waitList = waiting.get(input);
                waitList.getFirst();
                if (waitList.size() > 0) { waiting.notifyAll(); }
                else { waiting.remove(input); } 
            }
        }
    }
}

Note that there are all kinds of issues, here, that you are not specifying. Several people have brought up the issue that it may be dangerous to lock on a public object (because someone else might). That might be problem, or it might be a design goal. Of much more concern, to me, is that this solution is blocking threads instead of queuing tasks. A blocked thread is a pretty expensive way to represent queued work.

My suggestion is that you take the solutions above and rethink your problem. But, hey, that's not what you asked.

G. Blake Meike
  • 6,615
  • 3
  • 24
  • 40
  • That's not necessarily correct since synchronized lock is done by Object, not by Value. I've proposed the whole `ArrayList` thing to overcome this problem. – zxcmehran Apr 29 '18 at 17:17
  • What is "Value". Do you mean Type/Class? I think you may misunderstand something... – G. Blake Meike Apr 29 '18 at 17:34
  • Oh... wait... Are you saying you want it to block if two objects are `.equals`? If that's all, and T == String, just intern the string. – G. Blake Meike Apr 29 '18 at 17:36
  • @G.BlakeMeike Locking on interned string is a static solution. Anything else could easily lock on this string. – lexicore Apr 29 '18 at 17:49
  • @G.BlakeMeike Moreover, i'm on a mobile platform and memory management is a serious problem even if it gets GCed after some time (a lot of GCs makes the app a laggy trash on weak CPUs). – zxcmehran Apr 29 '18 at 17:52
  • @lexicore I have no idea what you mean by "static solution". You are certainly correct, though, that this solution could block or be blocked by other threads synchronizing on the object to which "input" is a reference. – G. Blake Meike Apr 29 '18 at 18:42
  • @G.BlakeMeike Static in a sense as if you'd have a public static pool of locks, available globally. – lexicore Apr 29 '18 at 19:18
0

NOTE: For the solution, skip to the UPDATE part below.

As i dived deeper in CopyOnWriteArrayList class, i noticed that CopyOnWriteArrayList.add() uses Arrays.copyOf() which makes a shallow copy of the elements list. It means that it only copies the array itself, not the elements inside. They're just passed on to the new array. So, the lock objects stay intact and we can be sure that the object retrieved via

runningTasks.get(runningTasks.indexOf(input))

is exactly the same object we added on runningTasks and locked on, and the list is on it's latest version on all threads immediately after editing anhy of list items.

I ran the experiment with structural changes to the list, just to be sure:

CopyOnWriteArrayList<String> l = new CopyOnWriteArrayList<>();
String s1 = new String("foo");
String s2 = new String("bar"); 
String s3 = new String("bar"); // Different object, same value

l.add(s1);
Log.e("TEST", "Result: "+String.valueOf( s1 == l.get(0) ));

l.add(s2);
Log.e("TEST", "Result: "+String.valueOf( s1 == l.get(0) ));
Log.e("TEST", "Result: "+String.valueOf( s2 == l.get(1) ));

l.add(s3);
Log.e("TEST", "Result: "+String.valueOf( s1 == l.get(0) ));
Log.e("TEST", "Result: "+String.valueOf( s2 == l.get(1) ));
Log.e("TEST", "Result: "+String.valueOf( s3 == l.get(2) ));

l.remove(1); // the s2
Log.e("TEST", "Result: "+String.valueOf( s1 == l.get(0) ));
Log.e("TEST", "Result: "+String.valueOf( s2 == l.get(1) )); // should be false
Log.e("TEST", "Result: "+String.valueOf( s3 == l.get(1) )); // should be true 

And the results were:

E/TEST: Result: true
E/TEST: Result: true
E/TEST: Result: true
E/TEST: Result: true
E/TEST: Result: true
E/TEST: Result: true
E/TEST: Result: true
E/TEST: Result: false
E/TEST: Result: true

So the references to original objects would be kept. Thus the code can be modified to use CopyOnWriteArrayList. The final version would be:

private static final CopyOnWriteArrayList<String> runningTasks  = new CopyOnWriteArrayList<>();


public void doSomethingSyncedByInput(String input) {

    String lock;
    int index;

    synchronized (runningTasks) {
        index = runningTasks.indexOf(input);
        if (index >= 0) {
            // get currently available lock object
            lock = runningTasks.get(index);
        } else {
            // add a reference on tasks list
            lock = new String(input);
            runningTasks.add(lock);
        }
    }

    synchronized (lock) {
        if(!runningTasks.contains(lock)){
            runningTasks.add(lock);
        }

        doSomething(input);

        index = runningTasks.indexOf(lock);
        if(index >= 0)
            runningTasks.remove(index);
    }
}

It's not perfect though.

Feedbacks are welcomed.

UPDATE

I managed to implement a better one. This one is completely thread-safe, prevents race conditions and cleans up memory after usage.

public class ParameterSynchronizer <T> {

    private final CopyOnWriteArrayList<T> objects;
    private final ConcurrentHashMap<T, Integer> lockCounter;

    public ParameterSynchronizer(){
        objects = new CopyOnWriteArrayList<>();
        lockCounter = new ConcurrentHashMap<>();
    }

    public T getLockObject(T input){
        synchronized (objects) {
            T lock = input;
            int index = objects.indexOf(lock);
            if (index >= 0) {
                lock = objects.get(index);
                lockCounter.put(lock, lockCounter.get(lock)+1);
            } else {
                objects.add(lock);
                lockCounter.put(lock, 1);
            }
            return lock;
        }
    }

    public void cleanUpLockObject(T input){
        synchronized (objects) {
            T lock = input;
            int counter = lockCounter.get(lock);
            if(counter == 1) {
                objects.remove(objects.indexOf(lock));
                lockCounter.remove(lock);
            }else{
                lockCounter.put(lock, counter - 1);
            }
        }
    }

}

Usage:

You should create a final static field having an instance of this class. Use getLockObject() to get the object needed for synchronized block. In end of the synchronized block (last line, in finally, before return, etc.), run cleanUpLockObject() to clear memory. Both methods should be called just once per execution per thread, as calling them changes the thread counter.

It keeps a track of how many threads are locking on this object, and clears the object if no other threads are locking on it.

private static ParameterSynchronizer<String> ps = new ParameterSynchronizer<>();

public void doSomethingSyncedByInput(String input){
    String lockObject = ps.getLockObject(input);
    synchronized (lockObject) {
        doSomething(input);
        ps.cleanUpLockObject(lockObject);
    }
}

And just in case, if doSomething() throws, it could be catched like

private static ParameterSynchronizer<String> ps = new ParameterSynchronizer<>();

public void doSomethingSyncedByInput(String input) throws Exception {
    String lockObject = ps.getLockObject(input);
    synchronized (lockObject) {
        try {
            doSomething(input);
        } catch(Exception e) {
            throw e;
        } finally {
            ps.cleanUpLockObject(lockObject);
        }
    }
}
zxcmehran
  • 1,435
  • 14
  • 24