125

Reading the Java documentation for the ADT List it says:

The List interface provides four methods for positional (indexed) access to list elements. Lists (like Java arrays) are zero based. Note that these operations may execute in time proportional to the index value for some implementations (the LinkedList class, for example). Thus, iterating over the elements in a list is typically preferable to indexing through it if the caller does not know the implementation.

What exactly does this mean? I don't understand the conclusion which is drawn.

skaffman
  • 398,947
  • 96
  • 818
  • 769
nomel7
  • 1,523
  • 3
  • 12
  • 20
  • 12
    Another example that may help you understand the general case of this is [Joel Spolsky's article "Back to Basics"](http://www.joelonsoftware.com/articles/fog0000000319.html) - search for "Shlemiel the painter's algorithm". – user May 08 '12 at 08:03

5 Answers5

210

In a linked list, each element has a pointer to the next element:

head -> item1 -> item2 -> item3 -> etc.

To access item3, you can see clearly that you need to walk from the head through every node until you reach item3, since you cannot jump directly.

Thus, if I wanted to print the value of each element, if I write this:

for(int i = 0; i < 4; i++) {
    System.out.println(list.get(i));
}

what happens is this:

head -> print head
head -> item1 -> print item1
head -> item1 -> item2 -> print item2
head -> item1 -> item2 -> item3 print item3

This is horribly inefficient because every time you are indexing it restarts from the beginning of the list and goes through every item. This means that your complexity is effectively O(N^2) just to traverse the list!

If instead I did this:

for(String s: list) {
    System.out.println(s);
}

then what happens is this:

head -> print head -> item1 -> print item1 -> item2 -> print item2 etc.

all in a single traversal, which is O(N).

Now, going to the other implementation of List which is ArrayList, that one is backed by a simple array. In that case both of the above traversals are equivalent, since an array is contiguous so it allows random jumps to arbitrary positions.

Soner Gönül
  • 97,193
  • 102
  • 206
  • 364
Tudor
  • 61,523
  • 12
  • 102
  • 142
  • 29
    Minor notice: LinkedList will search from the end of the list if the index is in the later half of the list, but that doesn't really change the fundamental inefficiency. It makes it only slightly less problematic. – Joachim Sauer May 07 '12 at 20:40
  • Out of curiosity, does anyone know if this is the same for C# (or other similar languages)? – Dan Diplo May 08 '12 at 08:39
  • 8
    *This is horribly inefficient*. For larger LinkedList -yes, for smaller it can work faster `REVERSE_THRESHOLD` is set 18 in `java.util.Collections`, it's weird to see so upvoted answer without the remark. – bestsss May 08 '12 at 09:28
  • 1
    @DanDiplo, if the structure is Linked, yes it holds true. Using LinkedS structures, however, is a small mystery. They almost always perform far worse than array backed ones (extra memory footprint, gc-unfriendly and terrible locality). Standard list in C# is array backed. – bestsss May 08 '12 at 09:30
  • @bestsss: When I said "horribly inefficient" I was making a comment about asymptotic performance, hence my O(N^2) vs O(N) analysis. Of course, for small data sets many such reasonings do not apply, since also for N < 8 insertion sort is faster than mergesort or quicksort. – Tudor May 08 '12 at 09:48
  • sure, about n^2, no doubt (even though LinkedList is doubly-linked, it's still n^2). However, for a LinkedList it might prove useful to keep the last 1-3 indexes cached. Then a full loop via get(prev+1)/get(prev-1) would yeild O(n) [wow!]. Sure the cache is to be invalidated on each modification but it's already taken care via `modCount`. The change is trivial and it can help poorly written code w/ very minor performance degrade of get(index) which perform poorly to boot. – bestsss May 08 '12 at 09:57
  • 3
    Minor notice: If you want to check which iteration type should be used (indexed vs Iterator/foreach), you can always test if a List implements [RandomAccess](http://docs.oracle.com/javase/6/docs/api/java/util/RandomAccess.html) (a marker interface): `List l = unknownList(); if (l instanceof RandomAccess) /* do indexed loop */ else /* use iterator/foreach */` – Tadas S May 08 '12 at 19:17
  • @Tudor **+1** for nice explanation I have one question .. Is
    for(String s: list) {
        System.out.println(s);
    }
    and
    List list = new ArrayList();
         for (Iterator iterator = list.iterator(); iterator.hasNext();) 
         {
      String string = (String) iterator.next();
      }
    **are equally efficient ??** if not then which is more **efficient**
    – KK_07k11A0585 May 09 '12 at 04:25
  • 1
    @KK_07k11A0585: Actually the enhanced for loop in your first example is compiled into an iterator like in the second example, so they are equivalent. – Tudor May 09 '12 at 07:30
35

The answer is implied here:

Note that these operations may execute in time proportional to the index value for some implementations (the LinkedList class, for example)

A linked list doesn't have an inherent index; calling .get(x) will require the list implementation to find the first entry and call .next() x-1 times (for a O(n) or linear time access), where an array-backed list can just index into backingarray[x] in O(1) or constant time.

If you look at the JavaDoc for LinkedList, you'll see the comment

All of the operations perform as could be expected for a doubly-linked list. Operations that index into the list will traverse the list from the beginning or the end, whichever is closer to the specified index.

whereas JavaDoc for ArrayList has the corresponding

Resizable-array implementation of the List interface. Implements all optional list operations, and permits all elements, including null. In addition to implementing the List interface, this class provides methods to manipulate the size of the array that is used internally to store the list. (This class is roughly equivalent to Vector, except that it is unsynchronized.)

The size, isEmpty, get, set, iterator, and listIterator operations run in constant time. The add operation runs in amortized constant time, that is, adding n elements requires O(n) time. All of the other operations run in linear time (roughly speaking). The constant factor is low compared to that for the LinkedList implementation.

A related question titled "Big-O Summary for Java Collections Framework" has an answer pointing to this resource, "Java Collections JDK6" which you might find helpful.

Community
  • 1
  • 1
andersoj
  • 22,406
  • 7
  • 62
  • 73
8

While the accepted answer is most certainly correct, might I point out a minor flaw. Quoting Tudor :

Now, going to the other implementation of List which is ArrayList, that one is backed by a simple array. In that case both of the above traversals are equivalent, since an array is contiguous so it allows random jumps to arbitrary positions.

This is not completely true. The truth is, that

With an ArrayList, a hand-written counted loop is about 3x faster

source: Designing for Performance, Google's Android doc

Note that the handwritten loop refers to the indexed iteration. I suspect its because of the iterator which is used with enhanced for loops. It produces a minor performance in penalty in a structure which is backed by a contiguous array. I also suspect this might be true for the Vector class.

My rule is, use the enhanced for loop whenever possible, and if you really care about performance, use indexed iteration only for either ArrayLists or Vectors. In most cases, you can even ignore this- the compiler might be optimizing this in the background.

I merely want to point out that in the context of development in Android, both the traversals of ArrayLists are not necessarily equivalent. Just food for thought.

Dhruv Gairola
  • 9,102
  • 5
  • 39
  • 43
  • Your source is Anndroid only. Does this hold for other JVMs as well? – Matsemann May 10 '12 at 23:29
  • Not completely sure tbh, but again, using enhanced for loops should be the default in most cases. – Dhruv Gairola May 25 '12 at 11:30
  • It makes sense to me, getting rid of all the iterator logic when accessing a data structure that uses an array works faster. I don't know if 3x faster, but certainly faster. – Setzer22 Aug 13 '13 at 15:19
8

Iterating over a list with an offset for the lookup, such as i, is analogous to Shlemiel the painter's algorithm.

Shlemiel gets a job as a street painter, painting the dotted lines down the middle of the road. On the first day he takes a can of paint out to the road and finishes 300 yards of the road. "That's pretty good!" says his boss, "you're a fast worker!" and pays him a kopeck.

The next day Shlemiel only gets 150 yards done. "Well, that's not nearly as good as yesterday, but you're still a fast worker. 150 yards is respectable," and pays him a kopeck.

The next day Shlemiel paints 30 yards of the road. "Only 30!" shouts his boss. "That's unacceptable! On the first day you did ten times that much work! What's going on?"

"I can't help it," says Shlemiel. "Every day I get farther and farther away from the paint can!"

Source.

This little story may make it easier to understand what is going on internally and why it is so inefficient.

alex
  • 479,566
  • 201
  • 878
  • 984
4

To find the i-th element of a LinkedList the implementation goes through all elements up to the i-th.

So

for(int i = 0; i < list.length ; i++ ) {
    Object something = list.get(i); //Slow for LinkedList
}
esej
  • 3,059
  • 1
  • 18
  • 22