4

Are there any performance benefits if i call ArrayList.ensureCapacity() method. In what case it is advisable to make use of the method?

Buhake Sindi
  • 87,898
  • 29
  • 167
  • 228
Sameer Patil
  • 526
  • 4
  • 13

1 Answers1

6

The performance benefit is realised in cases where you are about to add multiple elements to the list and know how many you're going to be adding. By calling ensureCapacity(int) you cause the underlying array to be resized once instead of potentially many times.

Note however, that in reality you should rarely need to call this method; typically you will either instantiate the ArrayList with a known capacity, or in cases where the list size is unknown you should probably be considering using a LinkedList instead.

Also note that the resize strategy of ArrayList is typically implemented in such a way that array copies are a rare operation (e.g. the capacity may increase by 50% every time the array becomes full). In other words, even if you do not call ensureCapacity in advance you are unlikely to notice any slow-down within your code.

Adamski
  • 54,009
  • 15
  • 113
  • 152
  • 2
    You probably meant "or in cases where the list size in **un**known". I don't think it's a sufficient reason to prefer LinkedList over ArrayList, though. ArrayList is almost always faster, except when the list is large and you have frequent insertions/removals to/from the start of the list or inside an iteration. – JB Nizet Oct 31 '11 at 11:34
  • LinkedList under almost any circumstance is a terrible idea, it wastes more memory (way more) than ArrayList, if imposes extra indirection (cache-miss), making LinkedList the worst possible datastructure to iterate. The only benefit is inserting in the middle during iteration. The benefit of ensureCapacity() can be noticed with the argument in the millions range and starting off with vanilla arraylist. – bestsss Oct 31 '11 at 14:09
  • @JBNizet, if the remove is during the iteration, the iteration itself on a linkedlist is a hefty price to pay. – bestsss Oct 31 '11 at 14:10
  • 1
    @bestnsss: That's a bit of a generalisation don't you think? Suppose I'm programming an embedded system where memory is a concern. I create an ArrayList and add 10M. Then I remove 9.9M elements; at this point the underlying array is still of length 10M. Also, supposing I want a predictable insertion time? Having to wait for a full array copy whilst the list resizes could throw up all manner of performance problems in some situations, the point being that the worst case insertion time is O(n) rather than O(1). – Adamski Oct 31 '11 at 14:19
  • 1
    @bestsss: that's not what I observe in my micro-benchmark. See http://pastebin.com/mSWCzYhW. The removal of every even element of a LinkedList with 10000 elements is 2 orders of magnitude faster than the same on an ArrayList. Just iterating over the list is indeed much quicker with an ArrayList, though. (JDK7) – JB Nizet Oct 31 '11 at 14:27
  • @JBNizet, indeed writing a benchmark is hard. The benchmark is unrealistic, if you need such type of removal you'd be better of w/ something like B-tree. Also this one lives perfectly in the L2 cache... Bad benchmark is a bad benchmark. – bestsss Oct 31 '11 at 14:54
  • @JBNizet, I wish to see a half-real-world application that removes every second element of a list. In that case, the traversal is paid once only but the effect is indeed great, also each node (during remove) is already in the cache, so it minimizes the cache-miss to unrealistically high level. – bestsss Oct 31 '11 at 14:58
  • @apology to Adamski for hijacking... continued: There is no contest that removing via iterator will always be in favor of a linked structure but most application have a lot more reads than writes and it's the reads the govern the overall performance. Since access (not only random) is a lot slower for the linked structures, not too many writes will not damage the performance. If you need some linked structure, at least linked Object[] w/ 8-10 elements (just enough to fit the cache line) and 0,1 reserved for prev/next is the way to go. Perhaps linked list should be reimlpemented like that. – bestsss Oct 31 '11 at 15:08
  • @bestss: You state that Object[] is big enough to fit in a cache line but these are object references not the actual data; i.e. there's still indirection due to pointer dereferencing because all Java objects live on the heap. Sure, iteration is fast if you iterate over an array of references and don't actually inspect the objects they point to, but it would be a fairly pointless exercise. – Adamski Oct 31 '11 at 16:09
  • @bestsss: A benchmark is only a benchmark, and of course it's unrealistic for most of the usages of a List. But it's also unrealistic to say that a LinkedList should never be used. I agree with you: most of the time, an ArrayList is better. That's what I said in my first comment. But LinkedList has its good uses, else it wouldn't exist. – JB Nizet Oct 31 '11 at 16:10
  • @JBNizet, at the moment I cannot think of a single case where I'd use LinkedList – bestsss Oct 31 '11 at 16:32
  • @bestsss: Example of when I'd use a LinkedList: As a Queue implementation (for example, performing a breadth-first search). Lots of good discussion points here: http://stackoverflow.com/questions/322715/when-to-use-linkedlist-over-arraylist. – Adamski Oct 31 '11 at 16:40
  • @Adamski, you still get significantly less cache-misses. 2 times cache-misses is basically 2 times better performance and that includes that all object are necessary to be examined. Checking class may not need extra load as it is part of the reference (usually). After all, speculative execution does that: attempts to run 2 cache-misses in the same time. The linked Object[] can allow much faster random access as well. – bestsss Oct 31 '11 at 16:41
  • @Adamski, queues are a lot better off w/ ringed buffer, ArrayDeque for instance. LinkedList loses always outside lock-free and even then bounded queues are better most of the time. – bestsss Oct 31 '11 at 16:42
  • @bestsss: I am well aware of when to use ArrayDeque and other blocking / non-blocking queue implementations; my point was to illustrate a *potential* situation where I'd consider using a LinkedList. I agree with you that ArrayList is the better choice most of the time but I think it's naive to exclude ever using LinkedList; the data structure has been implemented in practically every procedural language, and with good reason. – Adamski Oct 31 '11 at 16:59