How to make client-server socket applications access data in a synchronized way and test for it

Question

I have this simple server-socket application using TCP that allows multiple clients to connect to a server and store and retrieve data from an ArrayList. Everytime a client connection is accepted into the server, a new thread is created to handle the client activity, and since all clients must interact with the same data structure I decided to create a static ArrayList as follows:

public class Repository {
    private static List<String> data;

    public static synchronized void insert(String value){
       if(data == null){
           data = new ArrayList<>();
       }
        data.add(value);
    }

    public static synchronized List<String> getData() {
        if(data == null){
            data = new ArrayList<>();
        }
        return data;
    }
}

So, everytime a client inserts a value or reads the list they just call Repository.insert(value) or Repository.getData() from their respective threads.

My questions are:

Making these methods synchronized is enough to make the operations thread safe?
Are there any architectural or performance issues with this static List thing? I could also create an instance of the List in the server class (the one that accepts the connections) and send a reference via contructor to the Threads instead of using the static. Is this any better?
Can Collections.synchronizedList() add any value to such a simple task? Is it necessary?
How could I test the thread-safety in this scenario? I tried just creating multiple clients and make them access the data and everything seems to work, but I'm just not convinced... Here is a short snippet of this test:

            IntStream.range(0,10).forEach(i->{
            Client client = new Client();
            client.ConnectToServer();
            try {
                client.SendMessage("Hello from client "+i);
            } catch (IOException e) {
                e.printStackTrace();
            }});

            //assertions on the size of the array

Thanks in advance! :)

Your `getData()` method is dangerous. It gives any class, any package, any thread the ability to modify the `data` list without notifying or cooperating with your `Repository` class. You might just as well have made the `data` variable `public`. — Solomon Slow, Oct 29 '21 at 18:05
Re, "How could I test the thread-safety in this scenario?" First of all, the fact that your `getData()` method exists means that it _isn't_ thread safe. (Your `Repository` class has no ability to control what different threads might do to the list and when.) Second of all, testing thread-safety generally is a _hard problem._ Just because a program is "unsafe," that' doesn't mean it is _guaranteed_ to fail. You could test the **** out of a program, then ship it to a customer, and it fails first-time for them because their system is configured differently from yours,... — Solomon Slow, Oct 29 '21 at 18:10
...Or, it could run for half a year at the customer site, and then fail when they upgraded their OS. (Don't _ask_ me how I know about that one!!) — Solomon Slow, Oct 29 '21 at 18:11
Also, FYI, and not that it matters much in this case, but generally speaking, `static` is the enemy of testing. Google for "dependency injection" and "mock object" to learn more. — Solomon Slow, Oct 29 '21 at 18:14
@SolomonSlow you are completely right! I want to return the current state of the list to the client on demand, so i should probably make the `getData` method return a String version of it instead of the whole thing. — Iosiv, Oct 29 '21 at 19:29
Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. — Community, Oct 30 '21 at 04:25

score 1 · Accepted Answer · answered Oct 30 '21 at 01:29

Yes, see https://stackoverflow.com/a/2120409/3080094
Yes (see comments further below) and yes. Using one object (a singleton if you like) is preferred over static methods (the latter are, in general, harder to maintain).
Not necessary but preferred: it avoids you making mistakes. Also, instead of:

private static List<String> data;

you can use

private static final List<String> data = Collections.synchronizedList(new ArrayList<>());

which has three benefits: final ensures all threads see this value (thread-safe), you can remove the code for checking on a null-value (which can be error-prone) and you no longer need to use synchronized in your code since the list itself is now synchronized.

Yes, there are ways to improve this to make it more likely you find bugs but when it comes to multi-threading you can never be entirely sure.

About the " architectural or performance issues": every read of the list has to be synchronized so when multiple clients want to read the list, they will all be waiting for a lock to read the entire list. Since you are only inserting at the end and reading the list, you can use a ConcurrentLinkedQueue. This "concurrent" type (i.e. no need to use synchronized - the queue is thread-safe) will not lock when the entire list is read, multiple threads can read the data at the same time. In addition, you should hide the details of your implementation which you can do, for example, with an Iterator:

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

    private final Queue<String> dataq = new ConcurrentLinkedQueue<>();
    
    public Iterator<String> getData() {
        return dataq.iterator();
    }

About "testing the thread-safety": focus on the code that needs to be thread-safe. Using clients to connect to a server to test if the Repository code is thread-safe is inefficient: most of the test will just be waiting on I/O and not actually using Repository at the same time. Write a dedicated (unit) test just for the Repository class. Keep in mind that your operating system determines when threads start and run (i.e. your code for threads to start and run are just hints to the operating system), so you will need the test to run for a while (i.e. 30 seconds) and provide some output (logging) to ensure threads are running at the same time. Output to console should only be shown at the end of the test: in Java output to console (System.out) is synchronized which in turn can make threads work one after the other (i.e. not at the same time on the class under test). Finally, you can improve the test using a java.util.concurrent.CountDownLatch to let all threads synchronize before executing the next statements concurrently (this improves the chance of finding race-conditions). That is a bit much for this already long answer to explain, so I'll leave you with a (admittedly complicated) example (focus on how the tableReady variable is used).

How to make client-server socket applications access data in a synchronized way and test for it

1 Answers1