0

Lets assume I call a third party API and get back a mutable N-many list of objects. This list could be as small as 10 objects or as large as a few thousand. I then always want to insert just one object at index 0 of that returned List. I know i can easily call add at index 0 but this is going to be O(n) as it shifts every object for the insert. My question is, would it be faster on average (processing wise) to create a new List with the item i plan on inserting at the beginning and then call addAll on that new List passing in the returned 3rd party N-many List?

Justin A
  • 364
  • 4
  • 14
  • 3
    What about trying both ways, then benchmarking it and showing us the results? – Matej Kormuth Sep 19 '16 at 15:19
  • Have you tried measuring this in your specific conditions? – pvg Sep 19 '16 at 15:20
  • Do you expect `addAll` to somehow be faster than `O(n)`? – bradimus Sep 19 '16 at 15:20
  • 3
    This sounds like the kind of nano-optimization that will not matter. Maybe you should use LinkedList instead of ArrayList if you know you want to add at the head of the list.; – duffymo Sep 19 '16 at 15:23
  • You can try doing both and try to benchmark it, but due to many optmisation, I'm not sure this would be really meaningfull. Also, I'm not really sure the difference would be that great, even for only a few thousand. However, I'd rather have a standard list than a custom one from an API. But it all depend on what ou wanna do with it. – Asoub Sep 19 '16 at 15:24
  • @duffymo yes LL insert is O(1), but OP has no control over impl from 3rd party and creating a LL from an AL in O(n) – Bohemian Sep 19 '16 at 15:26
  • He has complete control over the List that he loads that data into. – duffymo Sep 19 '16 at 15:27
  • 1
    `addAll()` on an ArrayList will require 2 array copies. So, just inserting it at the beginning should be faster for any kind of list. – Codebender Sep 19 '16 at 15:28
  • @duffymo no. See first sentence. He gets whatever the library returns (which could already be a LL btw - unknown). – Bohemian Sep 19 '16 at 15:35
  • Yes. He can wrap it into a LL if he wishes. Still won't matter. Nano-optimization. The network roundtrip to get the database from a service will be far longer. – duffymo Sep 19 '16 at 15:37
  • @duffymo I know this may seem like a nano-optimization especially at small scale but i became currious at larger scale. The implicit operations that take place and if one was optimised better than the other. I was thinking that [https://en.wikipedia.org/wiki/Branch_predictor](Branch Prediction) may be a helpful factor here as argued for [http://stackoverflow.com/questions/11227809/why-is-it-faster-to-process-a-sorted-array-than-an-unsorted-array?rq=1](sorrted array processing). – Justin A Sep 19 '16 at 15:41
  • 1
    @duffyno LL doesn't "wrap": `LinkedList(Collection)` constructor calls `addAll()`, so it's O(n) – Bohemian Sep 19 '16 at 15:42
  • Good point @Bohemian. I'll vote your answer up. – duffymo Sep 19 '16 at 15:45
  • @MatejKormuth Sure I could try under various loads just seeing how many ms it takes but theres always big debate of how to properly run some performance testing. What would be your best approach for this situation? – Justin A Sep 19 '16 at 15:50
  • @justin an array copy of 1000 or so is pretty fast. The insertion will take way less than 1ms - my guess a few microseconds. If the library is doing any kind of I/O, the insertion will be immeasurably insignificant in the overall scheme of things. – Bohemian Sep 19 '16 at 15:56

2 Answers2

6

It depends on the list implementation. If you truly have no visibility of what list implementation your third-party has given you, all you can do is empirical testing and benchmarking.

More likely, they're returning you one of the standard Java list types, and indeed you've tagged your question arraylist -- is that what you're given?

ArrayList.add(index,element) uses System.arrayCopy() to copy each shifted element from index n to n+1, then writes the new element to its slot. That's O(n), however it's likely to be very fast indeed, since it will use the highly optimised system memmove routine to move whole chunks of RAM at a time. (see Why are memcpy() and memmove() faster than pointer increments? ).

In addition, if your extra element nudges the size of the list past the size of the allocated backing array, Java will create a new array and arraycopy the whole lot into there.

Bear in mind that you're only copying object references, not whole objects, so for 1000 elements, you're copying (worst case on a 64 bit machine) 64 bits * 1000 == 8 kilobytes of RAM.

Still, for really huge lists, the time it takes might become significant. Inserting into a linked list is cheaper (should be O(1) at the start or end)

You can make it an O(1) operation on an arbitrary List implementation by writing/finding a List implementation that is just a wrapper around the existing list. For example:

public class HeadedList<T> extends AbstractList<T> {

    private final List tail;

    public HeadedList(T head, List tail) {
       this.head = head;
       this.tail = tail;
    }

    public T get(int index) {
        return index == 0 ? head : tail.get(index - 1);
    }

    public int size() {
        return tail.size() + 1;
    }
}

(NB if you work in languages like Lisp/Clojure/etc you get very used to thinking of lists in this way)

But, only bother with this if benchmarking reveals that real performance problems are being caused by list building.

slim
  • 40,215
  • 13
  • 94
  • 127
  • 2
    I tagged it as ArrayList because that was what i was going to create and add the returned list too. I should correct the tag to be just List. I wasn't thinking about using a LinkedList here. Thanks for the really good answer. This was the type of answer I was looking for. I know most say "dont worry about performance in Java" but there are times in places where it is valuable to understand these things especially if its a pattern that is going to be reused when problems are found in the future. Ill upvote and make as the answer. – Justin A Sep 19 '16 at 16:26
  • What happens if I iterate over a HeadedList? – Bohemian Sep 19 '16 at 18:50
  • @Bohemian I left out some of the implementation (the ”...") but if you write it right, its Iterator should work fine. – slim Sep 19 '16 at 20:17
  • @Bohemian (I know a lot of time has passed, but I got an upvote and re-read) -- I changed the example to `extends AbstractList` instead of `implements List` - this way you get the rest of the implementation for free, and iteration etc. just works. – slim Mar 16 '20 at 09:56
2

If the returned List impl is ArrayList, both options are the same: O(n).
If the returned impl is LinkedList, inserting at head is O(1).

There is an always O(1) option: Create a List wrapper class that is backed by the returned list but allows insertion at head by storing the inserted element internally. You would have to create a custom iterator to iterate over the inserted elemdnt then delegate to the list. Most methods would need similar customisation.

If it's only a 1000 or so elements I wouldn't bother, unless your application is complete and you've determined there is a measurable and severe enough performance problem at this operation.

If you were inserting multiple elements at head, then you would take a hit once to create a LinkedList, then each insertion would be O(1), but since you only have 1 to insert, don't bother.

KISS: Just insert the element into the returned list. I'm sure it will be faster enough, and most likely way faster than the library anyway.

Bohemian
  • 412,405
  • 93
  • 575
  • 722
  • Hrm, `O(n)` means `n * c`, but `c` could be different for each solution. It is, however, probably pointless nano-optimisation. – slim Sep 19 '16 at 15:38
  • 1
    @Bohemian Thanks for the answer. When i was thinking about this, I was wondering if factors like Branch prediction would increase the performance of one over the other as it does in processing sorted Collections. http://stackoverflow.com/questions/11227809/why-is-it-faster-to-process-a-sorted-array-than-an-unsorted-array?rq=1 – Justin A Sep 19 '16 at 16:04