0

I must work with a Collection, and I am not sure about using a List or a Set. This collection must be sorted, but not by the order of insertion but for another one, so each time a new item is added, a Comparator should be executed in order to reorder the Collection. So, for this reason, an ArrayList could be the best option.

Removing objects from that Collection must be possible too, furthermore, I would really appreciate using removeIf method, so a Set would be the best option here.

Getting and iterating over the Collection will be the most repeated scenario, so it must have a good performance in this scenario.

Seeing that, I think that a Set would be a good decision, however, I was thinking about converting the Set into a List when adding items, then, once the list has been resorted, converting it back to a Set. Is it bad performing? What do you think?

Thanks in advance

Michael
  • 41,989
  • 11
  • 82
  • 128
FVod
  • 2,245
  • 5
  • 25
  • 52
  • 3
    Do you want your `Collection` to be able to contain duplicate elements? – Eran Feb 19 '18 at 07:38
  • 1
    *so each time a new item is added, a Comparable should be executed in order to reorder the Collection. So, for this reason, an ArrayList could be the best option* - No. Actually, an `ArrayList` is sorted by insertion order, which you don't want. A `Set` is sorted each time you ad an element, so it would be the best option here. – BackSlash Feb 19 '18 at 07:39
  • Can you elaborate on what you're trying to achieve with this list? Then we might be able to advise you better – Sam Feb 19 '18 at 07:41
  • 2
    what do you think about TreeSet? – Rishikesh Dhokare Feb 19 '18 at 07:42
  • 4
    Have you looked at [`TreeSet`](https://docs.oracle.com/javase/8/docs/api/java/util/TreeSet.html)? And why do you need a `Set` to use `removeIf`? It is a default method of all `Collection`s. Converting between list and set is defintely not helpful, as you'd loose the ordering (except you convert to `TreeSet`, of course, but then you do not need a list anyway). – Malte Hartwig Feb 19 '18 at 07:45
  • The problem of TreeSet is that iterating over it is slower than over an ArrayList or LinkedHashSet. The Collection should not have duplicated elements. – FVod Feb 19 '18 at 08:01
  • May be relevant: [How do I write a correct micro-benchmark in Java?](https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java) – lexicore Feb 19 '18 at 09:19

2 Answers2

2

Unless you have bulk inserts during which you would need no sorting, TreeSet is fine. Simply measure both solutions.

With TreeSet inserting already ordered items, like rereading a set from disk, performs bad in that even a balanced tree, will have a bit too large depth. That however can be remedied.

For better performance you might go for a B-tree (needs 3rd party code) instead of the binary TreeSet. Measure that too, as typically a facet such as deletion with rebalancing might be done suboptimally.

Joop Eggen
  • 107,315
  • 7
  • 83
  • 138
1

This depends a lot on how you fill and use your collection and performance of which operation is the most important.

Do you fill the collection with items at once? Or add new elements from time to time? Does the performance of adding elements matter? Or only the iteration performance is important?

If performance is critical, it might make sense to implement a few solutions and compare their performance using a benchmark.

I personally don't believe that iteration performance of a TreeSet is that much worse that ArrayLists or LinkedLists or LinkedHashMaps. Especially compared to linked data structures. Iteration on a tree should not be that different in the performance. But I have no data, so this is just a belief here.

Below are two implementation ideas.

First, if you load a lot of data at once and then add new items rather seldom, load the data into an ArrayList and sort it using Collections.sort. If you need to add another item do a binary search (Collections.binarySearch) and insert the element at the corresponding position. Wrap it all in a custom List implementation and you're good to go.

Next, if you fill the collection with the data "in the beginning" and then the collection is hardly modified, you may simply cache the iteration order in an ArrayList. Every time the collection is modified, reset this list and. When iteration is requested and the list is not null, just use it, otherwise first fill it in the order of the sorted set.

lexicore
  • 42,748
  • 17
  • 132
  • 221