2

Can someone explain the output of the following program:

public class DataRace extends Thread {
    static ArrayList<Integer> arr = new ArrayList<>();

    public void run() {
        Random random = new Random();
        int local = random.nextInt(10) + 1;
        arr.add(local);
    }

    public static void main(String[] args) {
        DataRace t1 = new DataRace();
        DataRace t2 = new DataRace();
        DataRace t3 = new DataRace();
        DataRace t4 = new DataRace();

        t1.start();
        t2.start();
        t3.start();
        t4.start();

        try {
            t1.join();
            t2.join();
            t3.join();
            t4.join();
        
        } catch (InterruptedException e) {
            System.out.println("interrupted");
        }

        System.out.println(DataRace.arr);

    }
}

Output:

  • [8, 5]
  • [9, 2, 2, 8]
  • [2]

I am having trouble understanding the varying number of values in my output. I would expect the main thread to either wait until all threads have finished execution as I am joining them in the try-catch block and then output four values, one from each thread, or print to the console in case of an interruption. Neither of which is really happening here.

How does it come into play here if this is due to data race in multithreading?

dreamcrash
  • 47,137
  • 25
  • 94
  • 117
amuu00
  • 71
  • 5

3 Answers3

5

The main problem is that multiple threads are adding to the same shared ArrayList concurrently. ArrayList is not thread-safe. From source one can read:

Note that this implementation is not synchronized.
If multiple threads access an ArrayList instance concurrently, and at least one of the threads modifies the list structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more elements, or explicitly resizes the backing array; merely setting the value of an element is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the list. If no such object exists, the list should be "wrapped" using the Collections.synchronizedList method. This is best done at creation time, to prevent accidental unsynchronized access to the list:

In your code every time you call

arr.add(local);

inside the add method implementation, among others, a variable that keeps track of the size of the array will be updated. Below is shown the relevant part of the add method of the ArrayList:

private void add(E e, Object[] elementData, int s) {
    if (s == elementData.length)
        elementData = grow();
    elementData[s] = e;
    size = s + 1; // <-- 
}

where the variable field size is:

/**
 * The size of the ArrayList (the number of elements it contains).
 *
 * @serial
 */
private int size;

Notice that neither is the add method synchronized nor the variable size is marked with the volatile clause. Hence, suitable to race-conditions.

Therefore, because you did not ensure mutual exclusion on the accesses to that ArrayList (e.g., surrounding the calls to the ArrayList with the synchronized clause), and because the ArrayList does not ensure that the size variable is updated atomically, each thread might see (or not) the last updated value of that variable. Hence, threads might see outdated values of the size variable, and add elements into positions that already other threads have added before. In the extreme, all threads might end-up adding an element into the same position (e.g., as one of your outputs [2]).

The aforementioned race-condition leads to undefined behavior, hence the reason why:

System.out.println(DataRace.arr);

outputs different number of elements in different execution of your code.

To make the ArrayList thread-safe or for alternatives have a look at the following SO thread: How do I make my ArrayList Thread-Safe?, where it showcases the use of Collections.synchronizedList()., CopyOnWriteArrayList among others.

An example of ensuring mutual exclusion of the accesses to the arr structure:

public void run() {
    Random random = new Random();
    int local = random.nextInt(10) + 1;
    synchronized (arr) {
        arr.add(local);
    }
}

or :

static final List<Integer> arr = Collections.synchronizedList(new ArrayList<Integer>());

  public void run() {
      Random random = new Random();
      int local = random.nextInt(10) + 1;
      arr.add(local);
  }
dreamcrash
  • 47,137
  • 25
  • 94
  • 117
1

TL;DR

ArrayList is not Thread-Safe. Therefore it's behaviour in a race-condition is undefined. Use synchronized or CopyOnWriteArrayList instead.

Longer answer

ArrayList.add ultimately calls this private method:

    private void add(E e, Object[] elementData, int s) {
        if (s == elementData.length)
            elementData = grow();
        elementData[s] = e;
        size = s + 1;
    }

When two Threads reach this same point at the "same" time, they would have the same size (s), and both will try add an element on the same position and update the size to s + 1, thus likely keeping the result of the second. If the size limit of the ArrayList is reached, and it has to grow(), a new bigger array is created and the contents copied, likely causing any other changes made concurrently to be lost (is possible that multiple threads will be trying to grow).

Alternatives here are to use monitors - a.k.a. synchronized, or to use Thread-Safe alternatives like CopyOnWriteArrayList.

Anderson
  • 169
  • 5
0

I think there is a lot of similar or closely related questions. For example see this.

Basically the reason of this "unexpected" behabiour is because ArrayList is not thread-safe. You can try List<Integer> arr = new CopyOnWriteArrayList<>() and it will work as expected. This data structure is recommended when we want to perform read operation frequently and the number of write operations is relatively rare. For good explanation see What is CopyOnWriteArrayList in Java - Example Tutorial.

Another option is to use List<Integer> arr = Collections.synchronizedList(new ArrayList<>()).

You can also use Vector but it is not recommended (see here). This article also will be useful - Vector vs ArrayList in Java.

dim
  • 992
  • 11
  • 26