
I am trying to analyze the complexity of adding an item to a simple data structure (an array of variable size). I do this by means of a doubling ratio experiment, i.e. I double the array size in each iteration of the experiment and then measure the time it takes to add a single entry to the array.

In my implementation, adding a number to the array requires reallocating all existing array entries. Because of that, one would expect the execution time to always increase as the number of items in the array increases.

Weirdly enough, this is not the case.

Here's a complete, minimal example:

public class Experiment {
    int size = 0;
    int[] arr = new int[0];

    public static void main(String[] args) throws Exception {
        int minN = 125;
        int maxN = 128000;
        long start;
        long duration;

        Experiment experiment = new Experiment();

        for(int n=minN; n<=maxN; n+=n) {
            while(experiment.size < n) experiment.add(111);

            start = System.nanoTime();
            experiment.add(111);
            duration = System.nanoTime() - start;

            System.out.println("N: " + n + "\t Execution time: " + duration);
        }
    }

    public void add(int number) throws Exception {
        // Init temporary array with size + 1
        this.size++;
        int[] tmpArr = new int[this.size];

        // Reallocate existing entries
        for(int i=0; i<this.arr.length; i++) tmpArr[i] = this.arr[i];

        // Add new number
        tmpArr[this.size - 1] = number;

        // Replace with temporary array
        this.arr = tmpArr;
    }
}
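For reference, the same grow-by-one reallocation can be written more compactly with `java.util.Arrays.copyOf`, which performs the identical O(n) copy under the hood. This is only an illustrative sketch (the class name `GrowByOne` is made up, not from the code above); the per-add cost is the same:

```java
import java.util.Arrays;

public class GrowByOne {
    private int[] arr = new int[0];

    // Same semantics as add() above: grow the array by exactly one slot,
    // copying all existing entries (an O(n) operation), then append.
    public void add(int number) {
        arr = Arrays.copyOf(arr, arr.length + 1); // allocates new array and copies
        arr[arr.length - 1] = number;
    }

    public int size() { return arr.length; }

    public static void main(String[] args) {
        GrowByOne g = new GrowByOne();
        for (int i = 0; i < 5; i++) g.add(111);
        System.out.println(g.size()); // prints 5
    }
}
```

Either way, every insertion copies all existing elements, which is what should make the timed add scale linearly with n.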

When executing the experiment, this is the result I get:

N: 125       Execution time: 1799
N: 250       Execution time: 624
N: 500       Execution time: 1133
N: 1000      Execution time: 2161
N: 2000      Execution time: 566
N: 4000      Execution time: 1270
N: 8000      Execution time: 3195
N: 16000     Execution time: 6655
N: 32000     Execution time: 12155
N: 64000     Execution time: 56298
N: 128000    Execution time: 61612

I executed the experiment many times, and the following pattern always holds: the first iteration takes much longer than the second. In the third and fourth iterations, execution time increases again. In the fifth, it decreases once more, after which it increases continuously.

I also noticed that if I start the experiment at the fifth iteration (by setting minN = 2000), it behaves as expected, i.e. the time continuously increases. But even then, the ratios between iterations are inconsistent, which they shouldn't be (sometimes execution time doubles from one iteration to the next, and at other times it almost quadruples).
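A single `System.nanoTime()` sample on the JVM can be dominated by one-off effects such as JIT compilation or a GC pause. One common workaround, sketched here (the names `MedianTiming`, `medianNanos`, and `trials` are made up for illustration, not a verified fix for this experiment), is to time the action several times and take the median:

```java
import java.util.Arrays;

public class MedianTiming {
    // Times `action` several times and returns the median sample,
    // so a single outlier (JIT, GC) does not dominate the result.
    static long medianNanos(Runnable action, int trials) {
        long[] samples = new long[trials];
        for (int t = 0; t < trials; t++) {
            long start = System.nanoTime();
            action.run();
            samples[t] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[trials / 2];
    }

    public static void main(String[] args) {
        long med = medianNanos(() -> { /* work under test */ }, 11);
        System.out.println(med >= 0); // prints true
    }
}
```

Note that repeating `add()` itself also grows the array between samples, so in this particular experiment the repeated action would need to be chosen carefully.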

Does anyone have ideas why this is the case? Shouldn't it continuously increase? Thanks.

ksbg
