It's overengineered for just this problem. Also, your code is faulty:
- It's
Integer
, not Int
(minor niggle)
- More importantly, a
remove
call removes the first matching element, and to make matters considerably worse, remove
on lists is overloaded: There's remove(int)
which removes an element by index, and remove(Object)
which removes an element by looking it up. In a List<Integer>
, it is very difficult to know which one you're calling. You want the 'remove by lookup' one.
On complexity:
On modern CPUs, it's not that simple. The CPU works on 'pages' of memory, and because fetching a new page takes on the order of 500 cycles or more, it makes more sense to simplify matters and consider any operation that does NOT require a new page of memory to be loaded, to be instantaneous.
That means that if we're talking about a list of, say, 10,000 numbers or fewer? None of it matters. It'll fly by. Any debate about 'efficiency' is meaningless until we get to larger counts.
Assuming that 'efficiency' is still relevant:
- integers don't have hashcode collisions.
- hashmaps with few to no key hash collisions are effectively O(1) on all single element ops such as 'add' and 'get'.
- arraylist's .remove(Object) method is O(n). It takes longer the larger the list is. In fact, it is doubly O(n): it takes O(n) time to find the element you want to remove, and then O(n) time again to actually remove it. For fundamental informatics twice O(n) is still O(n) but pragmatically speaking,
arrayList.remove(item)
is pricey.
- You're calling .remove about O(n) times, making the total complexity O(n^2). Not great, and not the most efficient algorithm. Practically or fundamentally.
An efficient strategy is probably to just commit to copying. A copy operation is O(n). For the whole thing, instead of O(n) per item. Sorting is O(n log n). This gives us a trivial algorithm:
- Sort the input. Note that you can do this with an
int[]
too; until java 16 is out and you can use primitives in collections, int[]
is an order of magnitude more efficient than a List<Integer>
.
- loop through the sorted input. Don't immediately copy, but use an intermediate: For the 0th item in the list, remember only 'the last item was FOO' and 'how many times did I see foo?'. Then, for any item, check if it is the same as the previous. If yes, increment count. If not, check the count: if it was 1, add it to the output, otherwise don't. In any case, update the 'last seen value' to the new value and set the count to 1. At the end, make sure to add the last remembered value if the count is 1, and make sure your code works even for empty lists.
That's O(n log n) + O(n)
complexity, which is O(n log n)
- a lot better than your O(n^2)
take.
Use int[]
, and add another step that you first go through juuust to count how large the output would be (because arrays cannot grow/shrink), and now you have a time complexity of O(n log n) + 2*O(n)
which is still O(n log n)
, and the lowest possible memory complexity, as sort is in-place and doesn't cost any extra.
If you really want to tweak it, you can go with a space complexity of 0 (you can write the reduced list inside the input).
One problem with this strategy is that you mess up the ordering in the input. The algorithm would produce 2, 3, 7
. If you want to preserve order, you can combine the hashmap solution with the sort, and make a copy as you loop solution.