2

I'm using an ArrayList to store CustomObjects (around 100K elements). The problem is that it takes time to find an object in that list in order to remove it.

I'm forced to use an ArrayList as I have to keep the order as is.

I was thinking of maintaining a HashMap<Object, Integer> to keep track of indexes.

The problem is that when I delete an element in the list, I also have to update all the indexes in my map, which is slow!

I also must be able to find elements in my list by both index and object.

If you can lead me to some sort of solution :)
Thanks.

EDIT: I implemented the linked list I needed from scratch, and it's working like a charm. Thanks for your help everyone tho :)

Glidro
  • 41
  • 5
  • You can use a [LinkedHashMap](https://docs.oracle.com/javase/7/docs/api/java/util/LinkedHashMap.html) - it's an implementation of the `Map` interface that maintains ordering like a `List` – AesSedai101 Oct 06 '17 at 13:05
  • With 100K elements, is there any reason that you couldn't keep these objects in an AVL tree? How are your key-value pairs configured? Why are you restricted to using a 1-dimensional structure like `ArrayList`? – Chris Phillips Oct 06 '17 at 13:17
  • Because I don't need Key-Value pair, and ArrayList because I want to keep order of addition of the items. – Glidro Oct 06 '17 at 13:31
  • Would it be an option to mark an object as deleted by simply setting the ArrayList element to `null`? That way, the indexes of your other objects won't change. – DodgyCodeException Oct 06 '17 at 13:34
  • It's an option but it's too dirty and would require lots of code modifications to check if value is null in the list. – Glidro Oct 06 '17 at 13:36

1 Answer

5

I'm forced to use an ArrayList as I have to keep the order as is.

Fortunately, you are not really forced to use ArrayList, because you have another option: use LinkedHashSet, which preserves insertion order and lets you find and remove items in O(1).

With this change you would no longer be able to access items by their index, and you would need to ensure that the items you insert into the container have proper implementations for hashCode and equals.
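A minimal sketch of the LinkedHashSet approach (the shape strings here are placeholders for your CustomObjects, which would need proper hashCode/equals implementations):

```java
import java.util.LinkedHashSet;

public class LinkedHashSetDemo {
    public static void main(String[] args) {
        // LinkedHashSet keeps insertion order while backing lookups with a hash table
        LinkedHashSet<String> shapes = new LinkedHashSet<>();
        shapes.add("circle");
        shapes.add("square");
        shapes.add("triangle");

        // Removal by object is O(1) on average; the order of the rest is preserved
        shapes.remove("square");

        System.out.println(shapes); // [circle, triangle]
    }
}
```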

I need to find my elements with indexes too

A somewhat "dirtier" option that is available when only a small percentage of items is ever removed is keeping nulls in the list instead of actually deleting the item. This way you could keep HashMap<Object,Integer> in sync with your ArrayList, at the expense of making additional null checks every time you access an item from the array list. Note that if you use this approach, the length of the list no longer signifies the number of items in your collection. Instead, you would need to use the number of items in the hash map.
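A rough sketch of that null-marking idea, assuming each object appears at most once (the class name and methods are hypothetical, not from the question):

```java
import java.util.ArrayList;
import java.util.HashMap;

// Hypothetical sketch: deleted slots become null so other indexes never shift.
public class NullMarkingList<T> {
    private final ArrayList<T> items = new ArrayList<>();
    private final HashMap<T, Integer> indexOf = new HashMap<>();

    public void add(T item) {
        indexOf.put(item, items.size());
        items.add(item);
    }

    // O(1): look up the slot via the map, null it out, drop the map entry
    public void remove(T item) {
        Integer idx = indexOf.remove(item);
        if (idx != null) {
            items.set(idx, null);
        }
    }

    public T get(int index) {
        return items.get(index); // may be null if that slot was deleted
    }

    public int size() {
        return indexOf.size(); // live count comes from the map, not the list
    }
}
```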

Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
  • If you need to store multiple times the same object in the collection, you should use `LinkedList` instead of `LinkedHashSet` – Olivier Boissé Oct 06 '17 at 13:10
  • Well I tried using a LinkedList, and everything went way slower, from adding to removing. – Glidro Oct 06 '17 at 13:13
  • 2
    @Glidro Right, `LinkedList` wouldn't help, it would just be slower pretty much everywhere. – Sergey Kalinichenko Oct 06 '17 at 13:20
  • 1
    @Glidro `LinkedList` has an insertion/deletion avg complexity of O(n), so you really only want to use this structure when n values are low (where n is the number of values you're planning on putting into your data structure). – Chris Phillips Oct 06 '17 at 13:22
  • And I need to find my elements with indexes too, so LinkedHashSet is not possible – Glidro Oct 06 '17 at 13:24
  • 1
    @Glidro You will need to figure out a way to not access objects by index, using some other key instead. Otherwise all associative containers would be out of consideration, leaving you with slow O(n) options. – Sergey Kalinichenko Oct 06 '17 at 13:31
  • @Glidro do you need to find elements by index in O(1) time? If O(n) is acceptable, you could "index into" the LinkedHashSet by iterating n times. – DodgyCodeException Oct 06 '17 at 13:38
  • @dasblinkenlight the null option is interesting but not viable, because the main purpose of this talk is to improve performances when removing many elements, in fact over 100K elements we might get rid of 60K at any given time, so keeping nulls is not possible. – Glidro Oct 06 '17 at 13:51
  • @Glidro 100K/60K does not sound too bad, I'd actually give it a try. If you delete in batches, make a method that goes through the list+hash set for the entire batch, removing one item at a time, then walks through the remaining items, and deletes `null`s. You could also optimize removal "straight" from the array list if you collect all items to remove upfront, then walk through the list once, copy the items that you want to keep into another list, and then replace the original list with the copy. If you do 60K items in one batch, copying may actually be faster. – Sergey Kalinichenko Oct 06 '17 at 14:07
  • Yup but the 60K are "random", I mean the process is never removing first or last 60K elements, it can remove 114 random, 2547 random, or even 90% of the whole items. – Glidro Oct 06 '17 at 14:14
  • @Glidro Do you remove them all at once, or do you remove some, than add some, then remove some more, then do something else before removing some additional ones? – Sergey Kalinichenko Oct 06 '17 at 14:18
  • The process can add and remove all the time, basically it's stocking shapes to be displayed by a renderer. So many shapes can get removed/added any time, but in most cases, shapes are getting removed. Tho, if deleting, all the shapes to be deleted are deleted at once, not delayed. – Glidro Oct 06 '17 at 14:21
  • @Glidro Since all the shapes that you delete are deleted at once, you should be able to make a `HashSet` of the k shapes to be deleted in O(k), then walk the list once and collect all shapes that you want to keep into a new list in O(n). Finally, replace the original list with the new one. This will save you from doing O(n*k), which may be a lot slower. – Sergey Kalinichenko Oct 06 '17 at 14:28
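The batch-removal idea from the last comment could be sketched like this (a minimal, hypothetical helper; names are mine, not from the thread):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class BatchRemove {
    // Removes every element of toRemove from list in O(n + k),
    // instead of the O(n * k) cost of calling list.remove(item) k times.
    static <T> List<T> removeBatch(List<T> list, Set<T> toRemove) {
        List<T> kept = new ArrayList<>();
        for (T item : list) {
            if (!toRemove.contains(item)) { // O(1) membership check via the HashSet
                kept.add(item);
            }
        }
        return kept; // caller replaces the original list with this one
    }

    public static void main(String[] args) {
        List<String> shapes = new ArrayList<>(List.of("a", "b", "c", "d", "e"));
        Set<String> doomed = new HashSet<>(Set.of("b", "d"));
        shapes = removeBatch(shapes, doomed);
        System.out.println(shapes); // [a, c, e]
    }
}
```

Order of the surviving elements is preserved because the single pass copies them in list order.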