0

The function ArrayList.add() works very fast. I checked the source code, and saw the implement was Arrays.copyOf()

private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}

But when I use the method Arrays.copyOf() in my code, it becomes very slow. You can just run the code below to see it:

public class TestArrayCopy {
  public static void main(String[] args) {
    long t = System.currentTimeMillis();
    List<Integer> list = new ArrayList<>();
    for (int i = 0; i < 100000; i++) {
      list.add(i, i);
    }
    System.out.println(System.currentTimeMillis() - t);

    t = System.currentTimeMillis();
    Integer[] array = new Integer[0];
    for (int i = 0; i < 100000; i++) {
      array = Arrays.copyOf(array, array.length +1);
      array[i] = i;
    }
    System.out.println(System.currentTimeMillis() - t);
  }
}

Any ideas?

Federico klez Culloca
  • 26,308
  • 17
  • 56
  • 95
Valeamor
  • 53
  • 1
  • 6

5 Answers5

3

If you are asking why the two loops have a different running time (the second being much slower), the reason is that most calls to list.add(i, i) don't need to re-create the backing array.

Only when the backing array becomes full, a larger array is created. And the larger array is 50% larger than the original array, so it can handle many subsequent calls to add before it becomes full.

Eran
  • 387,369
  • 54
  • 702
  • 768
3

Your code is calling copyOf() every time a new element is added:

  • The first time, no elements need copying because the original array's length is 0.
  • The second time, one element must be copied.
  • The third time, two elements.
  • The fourth time, three elements.
  • ...and so on.

So for every element that is added, you must du more and more work to copy the previous elements. So if you are adding n elements, you'll have to perform a total of 1 + 2 + 3 + ... + (n - 1) = n * (n - 1) / 2 = n^2 / 2 - n / 2 copyings of individual elements. So your runtime is porportional to the square of the number of elements you add.

Contrast this with the proper approach, which is to have a larger array than you need (which gives you headroom to add more elements without copying all the time), and to multiply the size by a fixed factor every time you need to expand. This requires that you separately keep track of how many elements you've added and lie to your users about your true size. The factor is usually less than 2 (the Java code uses 1.5: int newCapacity = oldCapacity + (oldCapacity >> 1);), but the math is simpler if we use 2:

  • The initial array size is some small but nonzero number, e.g. 4, so you can add 4 elements for free (well, for the cost of allocating and initializing the array, which in practice will be 4)
  • For the fifth element, double the size to 8 and copy the 4 old elements; you can now add 4 more
  • For the ninth element, double the size to 16 and copy the 8 old elements; you can now add 8 more
  • And so on: for the n + 1 th element, you double the size to 2 * n and copy the old n elements, which gives you space for n more elements.

Even without evaluating the sum of the copyings, we can see that every batch of n new elements has already been "paid for" by the copying of the previous n elements, so the copying work is linear instead of quadratic. And indeed, 4 + 4 + 8 + 16 + ... + n / 2 + n = 2 * n (if n is a power of 2).

Aasmund Eldhuset
  • 37,289
  • 4
  • 68
  • 81
0

Because grow() is not always called when a new element is added. The ArrayList is always increased by a factor of 50%.

The relevant lines are those:

int oldCapacity = elementData.length;
int newCapacity = oldCapacity + (oldCapacity >> 1);

The (oldCapacity >> 1) is doing a bit shift to the right, so dividing by 2. This means that newCapacity is +50% of oldCapacity. Only if this is exceeded, the next call to grow() is made.

Dorian Gray
  • 2,913
  • 1
  • 9
  • 25
0

Arrays.copyOf is as slow as expected. Whileas List.add() is as quick as expected, since Arrays.copyOf in the grow method doesn't occur every time the add method is called. grow only occurs when the capability of the ArrayList is not enough.

ZhaoGang
  • 4,491
  • 1
  • 27
  • 39
0

The difference is amortized runtime. Your implementation copies the entire array each time a new copy is created. Each copy-operation is O(n), making the cost of the entire operation

1 + 2 + ... + n = n * (n + 1) / 2 = O(n^2)

grow on the other hand increases the size in a different manner. The new size of the array is calculated in this line:

int newCapacity = oldCapacity + (oldCapacity >> 1);

which essentially boils down to

int newCapacity = (int) (oldCapacity * 1.5)

So once the array is filled up, the array grows by a factor of 1.5

This in comparison takes for n insertions and an initial size m

m + 1.5 * m + 1.5 ^ 2 * m + ... + n = 
m + 1.5 * m + 1.5 ^ 2 * m + ... + 1.5 ^ log_1.5(n / m) * m = 
m * (1 - 1.5 ^ (log_1.5(n / m) + 1) / (1 - 1.5) = 
m * 2 * (n / m * 1.5 - 1) =
3 * n - 2 * m

The amortized runtime can be calculated as the total cost for n insertions divided by n:

(3 * n - 2 * m) / n = 3 - 2 * m / n

Since m < n in your case the second part of the equation becomes negligible, and we we end up with O(1) as amortized runtime and a runtime for all insertions of O(n), which significantly differs from the O(n^2) for copying the entire array after each insert.