7

I want to implement a strict round robin scheduling when i send requests to an external system. There are two external system servers. The first request should go to 'System1' and the second request must go to 'System2' and next one to 'System1' and so on.

Since i have only two servers to send the request to, and since i want maximum performance without any blocking and context switch, i have gone for AtomicBoolean since it makes use of CAS operation.

My implementation classes

1. RoundRobinTest.java

package com.concurrency;

import java.util.Iterator;

public class RoundRobinTest 
{
    public static void main(String[] args) 
    {
        for (int i = 0; i < 500; i++) 
        {
            new Thread(new RoundRobinLogic()).start();
        }
        try 
        {
            // Giving a few seconds for the threads to complete
            Thread.currentThread().sleep(2000);
            Iterator<String> output = RoundRobinLogic.output.iterator();
            int i=0;
            while (output.hasNext()) 
            {
                System.out.println(i+++":"+output.next());
                // Sleeping after each out.print 
                Thread.currentThread().sleep(20);
            }
        } 
        catch (Exception ex) 
        {
            // do nothing
        }
    }

}

2.RoundRobinLogic.java(Class with static AtomicBoolean object)

package com.concurrency;

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.atomic.AtomicBoolean;

public class RoundRobinLogic implements Runnable 
{
    private static AtomicBoolean bool = new AtomicBoolean(true);

    public static Queue<String> output = new ConcurrentLinkedDeque<>();

    @Override
    public void run() 
    {
        if(bool.getAndSet(false))
        {
            // Sending the request to first system
            output.add("Request to System1");
        }
        else if(!bool.getAndSet(true))
        {
            // Sending the request to first system
            output.add("Request to System2");
        }       
    }

}

Output:


......................
314:Request to System1
315:Request to System2
316:Request to System1
317:Request to System2
318:Request to System1
319:Request to System1
320:Request to System2
321:Request to System2
322:Request to System1
323:Request to System2
324:Request to System1
325:Request to System2
......................

The requests 318 and 319 had been sent to the same server and AtomicBoolean fails in this scenario. For my application 1000-2000 threads might be accessing the shared object at a time. From Java concurrency in practice, i have seen the below.

at high contention levels locking tends to outperform atomic variables, but at more realistic contention levels atomic variables outperform locks. This is because a lock reacts to contention by suspending threads, reducing CPU usage and synchronization traffic on the shared memory bus. With low to moderate contention, atomics offer better scalability; with high contention, locks offer better contention avoidance. (CAS based algorithms also outperform lock based ones on single CPU systems, since a CAS always succeeds on a single CPU system except in the unlikely case that a thread is preempted in the middle of the read modify write operation.)

Now i have the below questions.

  1. Is there any other efficient non-blocking way, to achieve the round-robin request sending.
  2. Under heavy contention, is it possible for the AtomicBoolean to fail ? What i understand is that performance/throughput might go down due to heavy contention. But in the above example AtomicBoolean fails . Why ?
Albin
  • 371
  • 1
  • 4
  • 18
  • 1
    You have 1k-2k threads running at the same time? How many processors are you running this on? – Sean Bright Apr 13 '15 at 18:52
  • 3
    Access to `bool` and `queue` is not synchronized even though it looks like their states are depending on each other. – Mick Mnemonic Apr 13 '15 at 18:56
  • just a thought, if this is really what you want, and it's not a simplification of the real problem, it's ok to have irregularities once in a while, since in large numbers the irregularities will even out and you'll get the same load on both servers. I like perfection, but sometimes, it simply isn't needed. Now you can go back to trying to get this done right ;) – TheZuck Apr 13 '15 at 18:56
  • @MickMnemonic just gave you the answer. – Zielu Apr 13 '15 at 18:58
  • 1
    Your `run` method doesn't take advantage of CAS. Consider this: it is possible that neither call to `output.add` in your `run` method will be executed due to timing issues. – Sean Bright Apr 13 '15 at 19:08
  • maybe you can use [this](http://stackoverflow.com/a/12629700/4020264) adding 1 to an atomic long and using the remainder modulo 2. – 1010 Apr 13 '15 at 19:13
  • @SeanBright: I did not understand how output.add will be affected by timing issues. Can you explain ? – Albin Apr 14 '15 at 08:11
  • the order of execution of your code is not guaranteed between threds. if two threads execute `getAndSet` and afterwards their `output.add` they may add the same System. or if a thread is about to execute the else and another thread sets `true` before, the first thread would exit the second if. – 1010 Apr 14 '15 at 12:55

3 Answers3

11

As an aside to John's answer, a cleaner and perhaps slightly more efficient implementation of RoundRobinLogic would use AtomicInteger or AtomicLong. This removes the need to compare the current value of the AtomicBoolean with the new value:

class RoundRobinLogic implements Runnable
{
    private static final AtomicInteger systemIndex = new AtomicInteger(1);

    public static final Queue<String> output = new ConcurrentLinkedDeque<>();

    @Override
    public void run()
    {
        if (systemIndex.incrementAndGet() % 2 == 0) {
            // Sending the request to first system
            output.add("Request to System1");
        } else {
            // Sending the request to second system
            output.add("Request to System2");
        }
    }
}

And this would allow you to extend this to additional systems fairly easily:

class RemoteSystem
{
    private final String name;

    RemoteSystem(String name)
    {
        this.name = name;
    }

    public String name()
    {
        return name;
    }
}

class RoundRobinLogic implements Runnable
{
    private static final AtomicInteger systemIndex = new AtomicInteger(1);

    private static final RemoteSystem[] systems = new RemoteSystem[] {
        new RemoteSystem("System1"),
        new RemoteSystem("System2"),
        new RemoteSystem("System3"),
        new RemoteSystem("System4"),
    };

    public static final Queue<String> output = new ConcurrentLinkedDeque<>();

    @Override
    public void run()
    {
        RemoteSystem system = systems[systemIndex.incrementAndGet() % systems.length];

        // Sending the request to right system
        output.add("Request to " + system.name());
    }
}
Sean Bright
  • 118,630
  • 17
  • 138
  • 146
  • for the requests to be perfectly alternated as OP wants, shouldn't `incrementAndGet` and `output.add` be synchronized? – 1010 Apr 13 '15 at 19:58
  • @1010 They should be, which is why the OP cannot get the exact output he would want without blocking. I'd hope the OP has the code up as an example. – John Vint Apr 13 '15 at 20:02
  • Right... and what happens if the requests to system1 take longer than the ones to system2? Right now it is just a fixed string, but in actuality it is going to be opening a socket, doing some I/O, etc. The method presented here only ensures that "about half" of the requests go to each of the two systems. – Sean Bright Apr 13 '15 at 20:14
  • In actual scenario, i don't need the 'output' queue. So your solution will work. I would think about this solution as well as @JohnVint's solution. – Albin Apr 14 '15 at 02:50
  • @Albin, please note that this solution also suffers from the race condition between the modulo evaluation and `output.add()` call (as was mentioned in the comments) -- the exact same problem you had in your original version. So it doesn't really implement the required "strict round robin". The timing issue becomes apparent when you add a small lag in the beginning of the `run` method: `try { Thread.sleep(20); } catch (InterruptedException e) { throw new RuntimeException(e); }` and run it with `RoundRobinTest`. – Mick Mnemonic Apr 14 '15 at 09:09
  • @MickMnemonic: I have added the queue only for testing my round robin implementation with multiple threads. But in the actual scenario, i am not using Queue. Instead of output.add, i will be sending a request to the external system. – Albin Apr 14 '15 at 10:43
  • I will edit the answer to remove the Queue. As JohnVint mentioned in his answer, output.add() will be replaced by the request to external system in the actual scenario. – Albin Apr 14 '15 at 10:52
  • 1
    @Albin, it does not really matter whether there is a queue or an external system you are calling. Unless you syncronize the two method calls, a perfect alternation between the service calls -- which I though was the whole point of your question -- is not guaranteed. – Mick Mnemonic Apr 14 '15 at 13:39
  • 1
    Correct. There is no guarantee of "perfect alternation." The only thing the code above guarantees is that given `n` external systems and `x` requests, `x / n` requests will go to each system. As I commented on your answer, wrapping the call to the external system with an intrinsic lock will significantly reduce throughput. – Sean Bright Apr 14 '15 at 13:45
4

Let's assume you are not using a Queue but an api to an actual system. The issue I see is related to:

    if(bool.getAndSet(false))
    {
        // Sending the request to first system
        output.add("Request to System1");
    }
    else if(!bool.getAndSet(true))
    {
        // Sending the request to second system
        output.add("Request to System2");
    }     

What if both conditionals fail? How is that possible? Imagine upon entering the first if the bool is true. Then you try to set it to false, but another thread beat you to it so you see false. You then try the else if. Now what if the else if when you get there is false, but set to true buy another thread? In that case both tries will fail.

I would refactor this to look like:

while(true){
  boolean current = bool.get();
  if(bool.compareAndSet(current, !current){
     if(current){ 
        //send request to first system
     } else {
        //send request to second system
     }
     return;
  }
}

As Sean Bright mentioned, because your are adding to a queue, even if you implement it like I have above, you may still see some values out of order because the Queue itself is not part of the synchronization with the AtomicBoolean.

Dmitry Zagorulkin
  • 8,370
  • 4
  • 37
  • 60
John Vint
  • 39,695
  • 7
  • 78
  • 108
  • 1
    I started working on my own answer but yours covers the broad strokes. I think it is important to mention that the results that are `add()`ed to the `Queue` will not necessarily be in the 1, 2, 1, 2, etc. order that the OP wants to see. Making that work would require more synchronization. – Sean Bright Apr 13 '15 at 19:27
  • 2
    I think the values returned by `current` and `!current` are not part of the atomic operation. – 1010 Apr 13 '15 at 19:29
  • @1010 Why do you think that? – John Vint Apr 13 '15 at 19:31
  • The only thing that the `get()`/`compareAndSet()` pairing gives you is that "you" (the current thread) actually changed the value (edit: but now that I think about it, while that is true, the value of `bool` could completely toggle between the `get()` and `compareAndSet()` calls). Dumping the `AtomicBoolean` in favor of an `AtomicLong` as @1010 pointed out in his comments on the question itself simplifies things. – Sean Bright Apr 13 '15 at 19:33
  • sorry, I thought that your bool was the original AtomicBoolean – 1010 Apr 13 '15 at 19:33
  • 1
    @SeanBright That's a fair point. 1010 should add that as an answer. I won't highjack is answer. – John Vint Apr 13 '15 at 19:34
  • I'm just commenting, feel free to do what you want with my thoughts. – 1010 Apr 13 '15 at 19:43
  • I have accepted Sean Bright's answer, since it was slightly cleaner implementation and scalable. Thanks for quick solution by using AtomicBoolean. Thanks to @1010 for the valuable comments. – Albin Apr 14 '15 at 08:04
  • When i compare @SeanBright's solution with this one, i feel the while loop,get(),compareAndSet() calls creates more contention for the sharedObject, where as if we use an AtomicLong/AtomicInteger with a incrementAndGet(), the contention could be less. – Albin Apr 14 '15 at 08:08
  • 1
    @Albin You're right. From a probabilistic viewpoint you are much more likely to see race failures here than compared with SeanBright's solution. I think his is better, tbh. That being said, I am keeping my answer up to illustrate where your race condition was. – John Vint Apr 14 '15 at 13:11
1

Since your requirement is basically: implement an atomic operation that

  1. evaluates and flips a boolean (or evaluates a modulo and increments a counter in the generic n servers case) and
  2. inserts an entry into a queue based on the result of step 1,

you cannot really accomplish that by making steps 1 and 2 individually thread-safe; you have to synchronize steps 1 and 2 together.

Here's a simple implementation that should work:

import java.util.LinkedList;
import java.util.Queue;

public class RoundRobinLogic implements Runnable 
{
    private static boolean bool = true;
    public static final Queue<String> OUTPUT = new LinkedList<String>();
    private static final Object LOCK = new Object();

    @Override
    public void run() {
        synchronized (LOCK) {
            OUTPUT.add(bool ? "Request to System1" : "Request to System2");
            bool = !bool;
        }
    }
}

Regarding your questions:

  1. You cannot avoid blocking if you need to synchronize two higher-than-processor-level operations together. Classes in java.util.concurrent.atomic employ machine-level atomic instructions, which is why code that uses these classes (usually, depending on platform) does not need to block.
  2. In your implementation, AtomicBoolean did not fail. Instead, there was a race condition between reading the boolean and adding an element into the queue.
Mick Mnemonic
  • 7,808
  • 2
  • 26
  • 30
  • While this will satisfy the artificial requirement that the `String`s in `output` would be in an alternating order, using an intrinsic lock that is shared by all threads will introduce more contention. Replace `OUTPUT.add(bool...)` with `OUTPUT.add(someHttpClient.makeRequest(...))` and things get out of control fairly quickly. Making `OUTPUT` a `Queue>` is an option too. – Sean Bright Apr 14 '15 at 13:05
  • How is that an artificial requirement? The OP asked for a "strict round robin". His original solution seemed to get the alteration correct a vast majority of times, so I don't really get the question if strictness wasn't a real requirement. – Mick Mnemonic Apr 14 '15 at 14:57
  • OP has said multiple times that the `output.add()` will be replaced by a call to an external system. And I don't know how to be any more clear on this point, but if your `OUTPUT.add` was replaced by a call to an external system, every other thread will block on the intrinsic lock, meaning that only a single request could be occurring at one time. So yes, your code will "work" in the sense that the `Queue` will be in the right order, but that isn't really what OP wants. @Albin, feel free to chime in here. – Sean Bright Apr 14 '15 at 15:08