My goal is a sorted data structure that can accomplish 2 things:
1. Fast insertion (at the location according to sort order)
2. I can quickly segment my data into the sets of everything greater than, or less than or equal to, an element. I need to know the size of each of these partitions, and I need to be able to "get" these partitions.
Currently, I'm implementing this in Java using an ArrayList, which provides #2 very easily since I can perform a binary search (Collections.binarySearch) and get an insertion index telling me at what point an element would be inserted. Then, based on the fact that indices range from 0 to the size of the array, I immediately know how many elements are greater than or smaller than my element, and I can easily get at those elements (as a sublist). However, this doesn't have property #1, and results in too much array copying.
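For illustration, the ArrayList approach described above can be sketched roughly as follows. This is only a sketch under stated assumptions: the class and method names are hypothetical, Integer elements stand in for the real data, and elements are assumed distinct (with duplicates, Collections.binarySearch may return any matching index).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: a sorted ArrayList with binary-search partitioning.
class SortedListSketch {
    private final List<Integer> data = new ArrayList<>();

    // Index at which key is stored, or would be inserted.
    // Collections.binarySearch returns -(insertionPoint) - 1 when the key is absent.
    int insertionIndex(int key) {
        int i = Collections.binarySearch(data, key);
        return i >= 0 ? i : -(i + 1);
    }

    // The search is O(log N), but ArrayList.add(index, e) shifts elements, so insertion is O(N).
    void insert(int key) {
        data.add(insertionIndex(key), key);
    }

    // Partition sizes follow directly from the insertion index (assumes distinct elements).
    int countLess(int key) {
        return insertionIndex(key);
    }

    int countGreaterOrEqual(int key) {
        return data.size() - insertionIndex(key);
    }

    // Partitions as subList views over the backing list; no copying.
    List<Integer> less(int key) {
        return data.subList(0, insertionIndex(key));
    }

    List<Integer> greaterOrEqual(int key) {
        return data.subList(insertionIndex(key), data.size());
    }
}
```

The partition queries are cheap here because the insertion index doubles as a rank; the cost is paid entirely in insert, which is the array copying mentioned above.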
This makes me want to use something like a SkipList or RedBlackTree that could perform the insertions faster, but then I can't figure out how to satisfy property #2 without making it take O(N) time.
Any suggestions would be appreciated. Thanks
EDIT: Thanks for the answers below that reference data structures that perform the insertion in O(log N) time and that can partition quickly as well, but I want to highlight the size() requirement: I need to know the size of these partitions without having to traverse the entire partition (which, according to this, is what the TreeSet does). The reasoning behind this is that in my use case I maintain my data using several different copies of data structures, each using a different comparator, and then need to ask "according to which comparator is the set of all things larger than a particular element smallest?". In the ArrayList case, this is actually easy and takes only O(Y log N), where Y is the number of comparators, because I just binary search each of the Y arrays and return the ArrayList with the highest insertion index. It's unclear to me how I could do this with a TreeSet without taking O(YN).
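As a rough sketch of that multi-comparator query (the class name, method name, and parameters are hypothetical, and each list is assumed to be already sorted by the comparator at the same position, with no ties under that comparator):

```java
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: Y sorted copies of the same data, one per comparator.
class MultiOrderSketch {
    // Returns the position y of the ordering with the highest insertion index for key,
    // i.e. the ordering under which the fewest elements sort at or after key.
    // Cost: one binary search per copy, so O(Y log N) overall.
    static <T> int bestOrdering(List<List<T>> sortedCopies,
                                List<Comparator<T>> comparators,
                                T key) {
        int best = -1;
        int bestIndex = -1;
        for (int y = 0; y < sortedCopies.size(); y++) {
            int i = Collections.binarySearch(sortedCopies.get(y), key, comparators.get(y));
            int insertion = i >= 0 ? i : -(i + 1); // decode "not found" result
            if (insertion > bestIndex) {
                bestIndex = insertion;
                best = y;
            }
        }
        return best;
    }
}
```

The whole query depends only on being able to get the rank of an element in O(log N), which is the operation I can't see how to get from a TreeSet.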
I should also add that an approximate answer for the insertion index would still be valuable even if it couldn't be solved exactly.