0

So a common programming practice for when an array can't hold the amount of information we want it to, is to double the size of an array. We do this by creating a new array, with double the original arrays size, populating this new array with the old values, and setting it as the old array. Here's a example in Java:

public int[] doubleArray(int[] old) {
    int[] doubled = new int[old.length * 2];
    for (int i = 0; i < old.length; i++) {
        doubled[i] = old[i];
    }
    return doubled;
}

So is it more better to use a standard array and double the size, when needed, or to simply use an ArrayList (which changes its own size based on the input you give it by 1)? Basically, is it better to double an arrays size when you need to store more than the array allows, or to increase its size by 1 each time? And what would be the limit to which one becomes a better choice than the other?

If I had to guess, using a standard array should be faster, given that it's in the java.lang, so I made the following class to test my theory:

public class ArrayTesting {
    public static void main(String[] args) {
    long startTime1 = System.nanoTime();
    int[] ints = new int[10];
    for (int i = 0; i < ints.length; i++) {
      ints[i] = i * 2;
    }
    long timeToSetOriginalValues = (System.nanoTime() - startTime1);
    long startTime2 = System.nanoTime();
    ints = doubleArray(ints);
    long timeToDouble = (System.nanoTime() - startTime2);
    long startTime3 = System.nanoTime();
    for (int i = 9; i < ints.length; i++) {
      ints[i] = i * 2;
    }
    long timeToSetNewValues = (System.nanoTime() - startTime3);
    System.out.println("Set Values in " + timeToSetOriginalValues);
    System.out.println("Doubled Array in " + timeToDouble);
    System.out.println("Finished setting values in " + timeToSetNewValues);
    System.out.println("Total time: " + (timeToSetOriginalValues + timeToDouble + timeToSetNewValues));
  }

  public static int[] doubleArray(int[] old) {
    int[] doubled = new int[old.length * 2];
    for (int i = 0; i < old.length; i++) {
      doubled[i] = old[i];
    }
    return doubled;
  }
}

But for an unknown reason, I am getting extremely varying results. Varying total time from everything between 11,000 and 4,000. Assuming it's something I did wrong, is there someone better at timing who could answer my question?

  • 2
    An `ArrayList` uses an `Array` as it's underlying data structure. It does the same thing when it runs out of room it increases the underlying `Arrays` size... – brso05 Nov 16 '15 at 14:02
  • 1
    Maybe take a look at [How not to write a microbenchmark](http://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java). – Andy Turner Nov 16 '15 at 14:05
  • What you can also do is to do a profiling. See: http://www.eclipse.org/tptp/home/documents/tutorials/profilingtool/profilingexample_32.html – Gordon Liang Nov 16 '15 at 14:06
  • @ brso05 Yes, but when it does so, it increases its own size by 1. So is it better to double the array, or add 1? Are there special situations (which is what I'm assuming) that either one is better than the other, and what would they be? I guess I'll edit the question to express this more. –  Nov 16 '15 at 14:07
  • 1
    @JaredMassa actually it doesn't increase by just 1... – brso05 Nov 16 '15 at 14:09
  • 2
    Are you familiar with amortized analysis? If not, try reading up about that; doubling of array size is a classic example in introductory texts. – Andy Turner Nov 16 '15 at 14:10
  • @brso05 I'm pretty sure it does.. if you read from the OpenJDK, look for the add() methods [Read it here](http://hg.openjdk.java.net/jdk7/jdk7/jdk/file/tip/src/share/classes/java/util/ArrayList.java) –  Nov 16 '15 at 14:12
  • @JaredMassa http://docs.oracle.com/javase/7/docs/api/java/util/ArrayList.html#ensureCapacity(int) – brso05 Nov 16 '15 at 14:14
  • There is a major difference between built in array in ArrayList, that the array can hold primitive values, but ArrayList can only hold objects. So if the purpose is to hold primitive values like int, it could make a little difference by using array or ArrayList from performance's point of view. But I am not sure how much difference it could be, may be just little. – Gordon Liang Nov 16 '15 at 14:15
  • @GordonLiang the implementation of using an array of integers in my test above is purely for testing. I could've made an array of File Objects if I wanted to, I just used integers for simplicity in testing. So I understand this would make little difference, but there should be a difference, and that's what I'm after. –  Nov 16 '15 at 14:17
  • 1
    It grows by 1.5x http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7u40-b43/java/util/ArrayList.java#236, at least in OpenJDK 7 and 8. However, the javadoc states that the exact growth strategy is unspecified, but guaranteed to have constant amortized time. – Andy Turner Nov 16 '15 at 14:17
  • @AndyTurner Which method are you reading? –  Nov 16 '15 at 14:20
  • 1
    Arrays when you have a known, fixed number of elements - ArrayList (or other collections) when you don't. The performance diference between manually handling the array or using an ArrayList is negligible, but the cleaner code matters – Jon Story Nov 16 '15 at 14:21
  • 1
    @JaredMassa the OpenJDK link you posted shows that it grows by 1.5. You have to follow all the way to grow(int). I don't see what you are bench marking though. Plus your sample size is way to small. – matt Nov 16 '15 at 14:21
  • @JaredMassa The one I linked, grow, which I found by tracing through from add. – Andy Turner Nov 16 '15 at 14:21
  • The Oracle JDK, grows by int newCapacity = oldCapacity + (oldCapacity >> 1); which is 1.5 times as well. – Gordon Liang Nov 16 '15 at 14:22
  • @GordonLiang oh jeez woops. Yes, 1.5 you're right. But this still shouldn't affect the question much. –  Nov 16 '15 at 14:34
  • What are you bench marking? You never use an ArrayList to compare. You create a 10 element array, then a 20 element array. Your bench mark is **way** too small, and it isn't addressing anything. What do you want to measure? – matt Nov 16 '15 at 14:42
  • @matt I never compared to an ArrayList because my array timing class gave radically different results per run. –  Nov 16 '15 at 15:45
  • @JaredMassa try doing your operations 1 million times. Then maybe you'll see a difference. You'll also have to check jvm options because after 10,000 times or so, it will be ''compiled''. You're current check is on the order of error, and I am actually surprised it takes so long. 3ms? – matt Nov 16 '15 at 18:32

3 Answers3

1

Right well I looked into it and here are the results:

I created a new classes for testing the time in editing an array, and the time in editing an ArrayList. Let's start with the ArrayTesting class.

public class ArrayTesting {
    private static long totalTotal = 0L;
    private static int[] ints = new int[10];
    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            runTest();
        }
        System.out.println("Final Length of Array: " + ints.length);
        System.out.println("Total Time was: " + totalTotal);
    }
    private static void runTest() {
        long startTime = System.nanoTime();
        for (int i = 0; i < ints.length; i++) {
            ints[i] = i * 2;
        }
        ints = doubleArray(ints);
        for (int i = 9; i < ints.length; i++) {
            ints[i] = i * 2;
        }
        long testTime = System.nanoTime() - startTime;
        totalTotal = totalTotal + testTime;
    }

    private static int[] doubleArray(int[] old) {
        int[] doubled = new int[old.length * 2];
        for (int i = 0; i < old.length; i++) {
            doubled[i] = old[i];
        }
        return doubled;
     }
}

The process for this class is as follows:

  1. Create new Array of Integer Objects (yes this matters), length 10.

  2. For five iterations, set all the indexes of the current array to index * 2, then double the array size, and fill the new indexes with their respective values of index * 2.

  3. Print results

Following the same process, I then created a testing class for an ArrayList:

import java.util.ArrayList;
class ArrayListTester {
    public static ArrayList<Integer> arl = new ArrayList<Integer>();
    public static long totalTotal = 0L;
    public static void main(String[] args) {
        //Set initial size for ArrayList to 10
        while(arl.size() < 10) arl.add(0);
        //Setting the size was not timed.
        for (int i = 0; i < 5; i++) {
            runTest();
        }
        System.out.println("Total ArrayList size: " + arl.size());
        System.out.println("Total Time: " + totalTotal);
    }
    public static void runTest() {
        long startTime1 = System.nanoTime();
        int initialSize = arl.size();
        for (int i = 0; i < initialSize * 2; i++) {
            if (i < initialSize)
                arl.set(i, ((Integer) i * 2));
            else
                arl.add((Integer) i * 2);
        }
        long totalForRun = System.nanoTime() - startTime1;
        totalTotal = totalTotal + totalForRun;
    }
}

If you read through this class, you will indeed find it follows the same steps, however using an ArrayList allows the code to be much more consise.

So now let's get to our results..

Since each class runs the doubling size thing for 5 iterations, our array size for both at the end is 320 or 10 * (2 ^ 5) (note the starting size for both arrays is 10). However, after a few quick runs of the test, the time consumption is drastically different.

Now, running the ArrayTester class 5 times in a row, and taking the average time for each run (adding it up and dividing by the number of runs) I recieved an average of 468,290 nanoseconds, some of you may be thinking this is a pretty decent chunk for simply doubling an array, but just you wait...

Next, moving to the ArrayListTester class and running it the same 5 times to gain an average, I received an average of exactly 2,069,230 nanoseconds.

If you'd like to test out these numbers for yourself, I ran these programs on an online compiler/interpreter here.

Final Result: It took the ArrayList almost 4.5 times longer to complete the same task.

Now going back to the original question asked: "So is it better to use a standard array and double the size, when needed, or to simply use an ArrayList?"

The answer truly deals with how efficient you want your code to be. For an ArrayList, efficiency literally doesn't exist anymore, however the aesthetics and organization of the code are dramatically increased. On the other hand, if you are looking for a more efficient method of handling things, and don't care as much about the code you need to write, then use an Array.

0

If you look at the ArrayList implementation, even it has the Array object, and when you call add method, it will also ensures the capacity and if needed, increments the size of the array like

elementData = Arrays.copyOf(elementData, newCapacity); 

Where elementData is the internal array to store the data.

Regarding Performance or memory consumption, since it is a wrapper with many functionality, obviously there will be some cost compare to the Array object.

rajuGT
  • 6,224
  • 2
  • 26
  • 44
0

An simple Array will always be faster, because ArrayList is an Array that creates a new Array when it is out of space and shrinks it when it is to big.

public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

Code is taken from the OpenJDK7 here you can clearly see, the small overhead a simply add method would have, which is checking the size of the array if it is to small or to big, copy the array to an array with a bigger size and then set the element. An normal Array would just have.

Arr[index] = value;

But the increased functionality of ArrayLists and the improvements of your code quality, will in most non performance optimised scenarios be far superior, to an traditional Array.

As research you can find examples of very well implemented resizing Arrays at http://algs4.cs.princeton.edu/home/, with a example of a resizing Array http://algs4.cs.princeton.edu/13stacks/ResizingArrayStack.java.html.

mrhn
  • 17,961
  • 4
  • 27
  • 46
  • 1
    Actually a simple array will be roughly comparable to an ArrayList (or close enough that you shouldn't care on a modern system)... the only way an Array is better is if you can optimise the growth strategy. Either way, arrays should be used for a fixed size only, cleaner code is more important than the tiny performance difference. – Jon Story Nov 16 '15 at 14:22
  • 1
    "But the increased functionality of ArrayLists and the improvements of your code quality, will in most non performance optimised scenarios be far superior, to an traditional Array." – mrhn Nov 16 '15 at 14:23
  • This is a good point. When an application, most of the time, does not need to check the boundary, using an array is a very clean and light way if the performance is critical. Then the application will grow the capacity in a central point when it knows it need to grow. HOWEVER, if the performance is that critical, there are also a bunch of other libraries or even the language itself, could be improved or considered. As a result, using Array or ArrayList may not be a key point. – Gordon Liang Nov 16 '15 at 14:24
  • 1
    Indeed, I was more pointing out the specific time you should use a fixed size array regardless – Jon Story Nov 16 '15 at 14:24