4

One interview question which I couldn't answer and couldn't find any relevant answers online.

Suppose in an arraylist, there are 10000 data, and I want to find the number which is currently on 5000th index, how does the arraylist know the indexes and give result in constant time?

Because if we are traversing through the arraylist to find the data, it would take linear time and not constant time.

Thanks in advance.

neil
  • 149
  • 1
  • 12

4 Answers4

7

The storage backing an ArrayList is an array. Whether primitive values or object references are stored, all objects in the array are in consecutive order in memory.

For array access, all the compiler has to do is have instructions that calculate the correct memory address based on the initial address and which index is desired, which is O(1). Then it can go directly to that calculated address. There is no traversing, so it is not O(n).

rgettman
  • 176,041
  • 30
  • 275
  • 357
  • Suppose we do not know that the element is at 5000th location, we just have to search for a particular value( for eg integer 3 which happens to be on the 5000th index), for searching the value, we will have to traverse through the arraylist to find the value and it would take linear time right?? – neil Aug 28 '18 at 01:00
  • @neil No need to post that follow-up under every answer. You could comment on your question and `@` us. Or even better: Post a new question, possibly linking this as a reference. – Zabuzard Aug 28 '18 at 01:38
  • thanks a lot @Zabuza – neil Aug 28 '18 at 01:39
1

ArrayLists can be thought of as an array of Objects (Which happens to be exactly how they are implemented). You can index into it as any other array at O(1). The advantage over a true "Array" is that it tracks a "Length" independent of the array's length and automatically extends the array when it "overflows"--plus a few extra operations.

LinkedLists (probably the structure you are thinking of) require you to walk from one item to the next, so the implementation is O(n) to find an item at an index.

Bill K
  • 62,186
  • 18
  • 105
  • 157
  • Suppose we do not know that the element is at 5000th location, we just have to search for a particular value( for eg integer 3 which happens to be on the 5000th index), for searching the value, we will have to traverse through the arraylist to find the value and it would take linear time right?? – neil Aug 28 '18 at 01:00
  • That was not the question, but yes it is correct. In that case you would not use an arraylist, you would use a hashmap where the key is the integer and the value is the index. Knowing what collection to use in what situation is extremely important. The key phrase from the question was: "I want to find the number which is currently on 5000th index" – Bill K Aug 28 '18 at 16:42
1

ArrayList

ArrayList uses an array under the hood, thus the name. Arrays are data-structures with a direct, fast, index-based access.

So if you ask for the element at index 5 000, it just asks its internal array:

// More or less
return array[5000];

Here's the full method from OpenJDK 8:

/**
 * Returns the element at the specified position in this list.
 *
 * @param  index index of the element to return
 * @return the element at the specified position in this list
 * @throws IndexOutOfBoundsException {@inheritDoc}
 */
public E get(int index) {
    rangeCheck(index);
    return elementData(index);
}

In particular, it does not traverse all elements up to that point. That's what other data-structures, without index-based access, need to do. Such as LinkedList. Note that there is an indicator interface, called RandomAccess (documentation). Classes implementing that interface have a direct index-based access. Current implementations are:

ArrayList, AttributeList, CopyOnWriteArrayList,
RoleList, RoleUnresolvedList, Stack, Vector

How arrays work

So, how does an array have direct access to that element? Well, arrays are of fixed size. When you create it, you need to tell it the size. For example 10 000:

Foo[] array = new Foo[10000];

Your computer will then allocate contiguous memory for 10 000 objects of Foo. The key is that the memory area is contiguous, not scattered around. So the third element comes directly after the second and directly before the fourth element in your memory.

When you now want to retrieve the element at position 5 000, your computer retrieves the Foo object at the following memory position:

startAddressOfArray + 5000 * sizeOfFoo

Everything is known since declaration of the array and the computation is fast, obviously in constant time O(1). Thus, arrays have direct index-based access to their elements. Because the stuff is stored together, contiguously in memory.

You may read more about arrays on Wikipedia.

Here is an image from techcrashcourse.com showing an array with the addresses of each element:

enter image description here

The array is of size 7 and stores integers that use 2 bytes (16 bits). Usually called short, so a new short[7] array. You can see that each element is offset by 2 bytes (the size of a short) to its previous element. Which makes it possible to access an element at a given position directly with a simple computation, as shown.

Zabuzard
  • 25,064
  • 8
  • 58
  • 82
  • Suppose we do not know that the element is at 5000th location, we just have to search for a particular value( for eg integer 3 which happens to be on the 5000th index), for searching the value, we will have to traverse through the arraylist to find the value and it would take linear time right?? – neil Aug 28 '18 at 01:00
  • @neil Exactly. That's not what an `ArrayList` (or `List`s more general) is designed for. Typically you would use a hash-based structure, like a `HashSet` for value-based access in `O(1)`. – Zabuzard Aug 28 '18 at 01:36
  • @Zabuzard `startAddressOfArray + 5000 * sizeOfFoo` or `startAddressOfArray + 5000 * sizeOfReference` ? As size of `Foo` can be different for different instances of `Foo` it is not possible I think, instead the reference size is fixed, either 4 bytes or 8 bytes depending on whether the machine is 32bit or 64bit – Suren Aznauryan Nov 03 '20 at 14:24
0

As its name suggests, ArrayList stores elements in an array. Here is the relevant piece of code in oracle JDK :

/**
 * The array buffer into which the elements of the ArrayList are stored.
 * The capacity of the ArrayList is the length of this array buffer. Any
 * empty ArrayList with elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA
 * will be expanded to DEFAULT_CAPACITY when the first element is added.
 */
transient Object[] elementData; // non-private to simplify nested class access

Thus, without surprise, list.get(index) only gets the nth element in the internal array :

public E get(int index) {
    Objects.checkIndex(index, size);
    return elementData(index);
}

E elementData(int index) {
    return (E) elementData[index];
}
Arnaud Denoyelle
  • 29,980
  • 16
  • 92
  • 148
  • Suppose we do not know that the element is at 5000th location, we just have to search for a particular value( for eg integer 3 which happens to be on the 5000th index), for searching the value, we will have to traverse through the arraylist to find the value and it would take linear time right?? – neil Aug 28 '18 at 01:01
  • @neil Yes. And if you regularly need to search for values and do not care about the order of the element, you can sort the Vector once (this takes O(n log(n)) time). This will then allow you to search element in logarithmic time. – Arnaud Denoyelle Aug 28 '18 at 12:06