
It's well known that the time complexity of array access by index is O(1).

The documentation of Java's ArrayList, which is backed by an array, says the same about its get operation:

The size, isEmpty, get, set, iterator, and listIterator operations run in constant time.

The lookup is done by getting the memory address of the element at a given index independently of the array's size (something like start_address + element_size * index). My understanding is that the array's elements have to be stored next to each other in the memory for this lookup mechanism to be possible.
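The address arithmetic can be sketched directly. The base address and element size below are made-up values purely for illustration; Java doesn't expose raw memory addresses, so this only simulates the calculation a JVM could perform internally:

```java
// Sketch of the lookup arithmetic: the cost is one multiply and one add,
// regardless of how large the array is. Addresses here are hypothetical.
public class AddressArithmetic {
    static long elementAddress(long startAddress, long elementSize, long index) {
        return startAddress + elementSize * index;
    }

    public static void main(String[] args) {
        long start = 0x1000; // hypothetical base address of the array
        long size = 4;       // e.g. 4 bytes per int element
        System.out.println(elementAddress(start, size, 0));  // 4096
        System.out.println(elementAddress(start, size, 10)); // 4136
    }
}
```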

However, from this question, I understand that arrays in Java are not guaranteed to have their elements stored contiguously in the memory. If that is the case, how could it always be O(1)?


Edit: I'm quite aware of how an ArrayList works. My point is: if contiguous storage for arrays is not guaranteed by the JVM specification, the elements could be at different areas in memory. Although that situation is highly unlikely, it would render the lookup mechanism mentioned above impossible, and the JVM would have to do the lookup some other way, which presumably wouldn't be O(1) anymore. At that point, it would contradict both the common knowledge stated at the top of this question and the documentation of ArrayList regarding its get operation.

Thanks everybody for your answers.


Edit: In the end, I think it's a JVM-specific thing, but most, if not all, JVMs stick to contiguous storage of an array's elements even when there's no guarantee, so that the lookup mechanism above can be used. It's simple, efficient and cost-effective.

As far as I can see, it would be silly to store the elements all over the place and then have to take a different approach to doing the lookup.

  • Probably one can read that List-O(1) in the sense of "O(good enough most of the time)". Meaning: for 99% of your use cases / real-world scenarios it is O(1). – GhostCat Nov 01 '16 at 10:03

4 Answers


As far as I'm aware, the spec gives no guarantee that arrays will be stored contiguously. I'd speculate that most JVM implementations will though. In the basic case it's simple enough to enforce: if you can't expand the array because other memory is occupying the space you need, move the whole thing somewhere else.

Your confusion stems from a misunderstanding of the meaning of O(1). O(1) does not mean that a function is performed in a single operation (e.g. start_address + element_size * index). It means that the operation is performed in a constant amount of time irrespective of the size of the input data - in this case, the array. This is perfectly achievable with data that is not stored contiguously. You could have a lookup table mapping indexes to memory locations, for example.
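The lookup-table idea can be made concrete. The chunked structure below is purely illustrative (no real JVM stores arrays this way): the chunks may live anywhere on the heap, yet get is still O(1), because finding the right chunk and offset is plain arithmetic, independent of total length:

```java
// Illustrative only: a "chunked array" whose chunks need not be adjacent
// in memory, yet whose get() is constant-time.
public class ChunkedArray {
    private static final int CHUNK_SIZE = 4;
    private final int[][] chunks;

    ChunkedArray(int length) {
        int numChunks = (length + CHUNK_SIZE - 1) / CHUNK_SIZE;
        chunks = new int[numChunks][CHUNK_SIZE]; // chunks can be scattered
    }

    void set(int index, int value) {
        chunks[index / CHUNK_SIZE][index % CHUNK_SIZE] = value;
    }

    int get(int index) {
        // Two dereferences plus a division and a modulo: constant time,
        // no matter how long the "array" is.
        return chunks[index / CHUNK_SIZE][index % CHUNK_SIZE];
    }
}
```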

Michael
  • Yeah, I also guess that most JVM implementations do it that way even without a guarantee. – Luciano Vercetti Nov 02 '16 at 02:47
  • I'm aware that O(1) is about the constant amount of time an operation takes. If two algorithms are both O(1), it doesn't mean that they take the same amount of time to do an operation. – Luciano Vercetti Nov 02 '16 at 03:10
  • I just find it strange that if an array's elements are not stored contiguously, the JVM will have to take _another_ way to do the lookup (e.g. the lookup table you mentioned), and that way couldn't be as simple, efficient and cost-effective as the "traditional" one (`start_address + element_size * index`) even if it might also be O(1). I mean it's not worth it to _not_ do contiguous storage and then face that consequence. Thanks anyway. You have my upvote :-) – Luciano Vercetti Nov 02 '16 at 03:10

From the linked question you can see that even though it's not mandated by the JVM rules, it's highly likely that 1D arrays are contiguous in memory.

Given a contiguous array, the time complexities of ArrayList are as documented. However, it's not impossible that in a special case or a special JVM the complexities might be slightly different. It's impossible to state the time complexities if you have to account for every kind of VM that the spec allows.

Kayaman

Every time an element is added, the capacity is checked: http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b27/java/util/ArrayList.java#ArrayList.add%28java.lang.Object%29

public boolean add(E e) {
   ensureCapacity(size + 1);  // Increments modCount!!
   elementData[size++] = e;
   return true;
}

Here, ensureCapacity() does the trick to keep the array sequential. How? http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b27/java/util/ArrayList.java#ArrayList.ensureCapacity%28int%29

public void ensureCapacity(int minCapacity) {
    modCount++;
    int oldCapacity = elementData.length;
    if (minCapacity > oldCapacity) {
        Object oldData[] = elementData;
        int newCapacity = (oldCapacity * 3)/2 + 1;
        if (newCapacity < minCapacity)
            newCapacity = minCapacity;
        // minCapacity is usually close to size, so this is a win:
        elementData = Arrays.copyOf(elementData, newCapacity);
    }
}

Thus, at every stage it ensures that the backing array has enough capacity and stays contiguous, i.e. any index within range can be retrieved in O(1).
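The growth rule in the JDK 6 source above is newCapacity = oldCapacity * 3 / 2 + 1. A quick sketch of the capacity sequence it produces, starting from the default capacity of 10:

```java
// Reproduces the growth arithmetic from the ensureCapacity() source above.
public class GrowthSequence {
    static int grow(int oldCapacity) {
        return (oldCapacity * 3) / 2 + 1;
    }

    public static void main(String[] args) {
        int cap = 10;
        for (int i = 0; i < 4; i++) {
            System.out.println(cap); // prints 10, 16, 25, 38
            cap = grow(cap);
        }
    }
}
```

Each growth allocates a fresh contiguous array and copies the old elements into it, which is what keeps index arithmetic valid after every add.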

thepace

An ArrayList wraps a real array. (On add, it might need to grow.) So get and set have the same complexity as array access: O(1).

However, an ArrayList can (until some future version of Java) only contain objects. For primitive types like int or char, wrapper classes are needed; these are less efficient, and the boxed objects may end up scattered over the allocated memory. The lookup is still O(1), but with a larger constant factor.
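The difference can be seen by comparing the two side by side. Both lookups below are O(1), but the list stores references to boxed Integer objects that may be scattered on the heap, while the array stores the ints themselves in one contiguous block:

```java
import java.util.ArrayList;
import java.util.List;

public class BoxingDemo {
    public static void main(String[] args) {
        int[] primitives = new int[1000];
        List<Integer> boxed = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            primitives[i] = i;
            boxed.add(i);          // autoboxes each int to an Integer object
        }
        System.out.println(primitives[500]); // direct load from the array
        System.out.println(boxed.get(500));  // reference load, then unbox
    }
}
```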

So for primitive types you might use arrays and do the growing yourself:

    elementData = Arrays.copyOf(elementData, newCapacity);

Or, if it fits your data, use BitSet, where the indices with true are the values.
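A brief sketch of that BitSet idea: instead of a list of boxed Integers, each value is represented by a set bit at that index, which is far more compact for small non-negative ints:

```java
import java.util.BitSet;

public class BitSetDemo {
    public static void main(String[] args) {
        BitSet values = new BitSet();
        values.set(3);   // "contains 3"
        values.set(42);  // "contains 42"
        System.out.println(values.get(42));       // true
        System.out.println(values.get(7));        // false
        System.out.println(values.cardinality()); // 2 values stored
    }
}
```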

Joop Eggen