17

Is there an existing List implementation in Java that maintains order based on provided Comparator?

Something that can be used in the following way:

Comparator<T> cmp = new MyComparator<T>();
List<T> l = new OrderedList<T>(cmp);
l.add(someT);

so that someT gets inserted such that the order in the list is maintained according to cmp

(On @andersoj suggestion I am completing my question with one more request)

Also I want to be able to traverse the list in sorted order without removing the elements, i.e:

T min = Const.SMALLEST_T;
for (T e: l) {
  assertTrue(cmp.compare(min, e) >= 0);
  min = e;
}

should pass.

All suggestions are welcome (except telling me to use Collections.sort on the unordered full list), though, I would prefer something in java.* or eventually org.apache.* since it would be hard to introduce new libraries at this moment.

Note: (UPDATE4) I realized that implementations of this kind of list would have inadequate performance. There two general approaches:

  1. Use Linked structure (sort of) B-tree or similar
  2. Use array and insertion (with binary search)

No 1. has problem with CPU cache misses No 2. has problem with shifting elements in array.

UPDATE2: TreeSet does not work because it uses the provided comparator (MyComparator) to check for equality and based on it assumes that the elements are equal and exclude them. I need that comparator only for ordering, not "uniqueness" filtering (since the elements by their natural ordering are not equal)

UPDATE3: PriorityQueue does not work as List (as I need) because there is no way to traverse it in the order it is "sorted", to get the elements in the sorted order you have to remove them from the collection.

UPDATE:

Similar question:
A good Sorted List for Java
Sorted array list in Java

Community
  • 1
  • 1
Op De Cirkel
  • 28,647
  • 6
  • 40
  • 53
  • 5
    This behavior would violate the `List` contract, and would generally be a Bad Idea. (Guava has considered, and rejected, such features.) – Louis Wasserman May 20 '12 at 17:13
  • 2
    Which part of the `List` contract would be violated? – Op De Cirkel May 20 '12 at 17:14
  • 3
    The specification for `add(E)` says "Appends the specified element to the end of this list." Adding it anywhere else in the list violates the contract. – Louis Wasserman May 20 '12 at 17:17
  • `TreeSet` and `TreeMap` violate the contract of `Set` and `Map` accordingly, but they are still there (in `java.*`) – Op De Cirkel May 20 '12 at 17:23
  • Why would you want such a class to implement `List` when `List` behaves quite differently? (Ignore the fact that the collections interface are "broadly" defined, and even then Java library implementations break the contract.) – Tom Hawtin - tackline May 20 '12 at 17:23
  • 1
    No, they don't violate the `Set` or `Map` contract -- at least not with a comparator that is consistent with equals, which is totally fair. Even in that case, their behavior is "more or less" similar to their contractual obligations, whereas the approach you describe here would just _completely_ violate the contract. – Louis Wasserman May 20 '12 at 19:30
  • @Louis Wasserman You are right, I agree. – Op De Cirkel May 20 '12 at 19:38
  • @Op De Cirkel: RE your equality/uniqueness problem... see my answer. – andersoj May 20 '12 at 22:21
  • OK, so you also want it iterable. There is a reason such a thing does not exist, because the performance is a challenge. Do you REALLY need to maintain the list in sorted order AND traverse an iterator in order? – andersoj May 20 '12 at 22:42
  • Can you write down which `List` features you need to use? – andersoj May 20 '12 at 22:51
  • Near duplicate: http://stackoverflow.com/questions/416266/sorted-collection-in-java – andersoj May 20 '12 at 23:08

3 Answers3

17

You should probably be using a TreeSet:

The elements are ordered using their natural ordering, or by a Comparator provided at set creation time, depending on which constructor is used.

Example:

Comparator<T> cmp = new MyComparator<T>();
TreeSet<T> t = new TreeSet<T>(cmp);
l.add(someT);

Note that this is a set, so no duplicate entries are allowed. This may or may not work for your specific use-case.

beerbajay
  • 19,652
  • 6
  • 58
  • 75
  • 3
    Note though that there's a big difference between a `List` and a `Set`: One allows duplicate elements, one does not. – Voo May 20 '12 at 17:16
  • Good point; I've updated the answer to make sure this is clear. – beerbajay May 20 '12 at 17:18
  • It should be possible to circumvent the no duplicates restriction by inserting wrapper objects (W) that each contain a pointer to some T object, then somehow make sure that those W objects are treated as different even if they point to the same T. This can be done by defining a lexicographic order on W's. In particular, to compare two W's, first compare the corresponding T's, and only if they're equal, compare the identity of the two W's. Though I'm not sure how the latter is done in Java; in C, you would just use the < operator on the two W pointers. – Ambroz Bizjak May 20 '12 at 23:27
9

Response to new requirement. I see two potentials:

  • Do what the JavaDoc for PriorityQueue says:

    This class and its iterator implement all of the optional methods of the Collection and Iterator interfaces. The Iterator provided in method iterator() is not guaranteed to traverse the elements of the priority queue in any particular order. If you need ordered traversal, consider using Arrays.sort(pq.toArray()).

    I suspect this will yield the best performance given your requirements. If this is not acceptable, you'll need to better explain what you're trying to accomplish.

  • Build a List that simply sorts itself upon addition of new elements. This is a real pain... if you used a linked structure, you can do an efficient insertion sort, but locality is bad. If you used an array-backed structure, insertion sort is a pain but traversal is better. If iteration/traversal is infrequent, you could hold the list contents unsorted and sort only on demand.

  • Consider using a PriorityQueue as I suggested, and in the event you need to iterate in order, write a wrapper iterator:

    class PqIter implements Iterator<T>
    {
       final PriorityQueue<T> pq;
       public PqIter(PriorityQueue <T> source)
       {
         pq = new PriorityQueue(source); 
       }
    
       @Override
       public boolean hasNext()
       {
         return pq.peek() != null
       }
    
       @Override
       public T next()
       { return pq.poll(); }
    
       @Override
       public void remove()
       { throw new UnsupportedOperationException(""); }
    }
    
  • Use Guava's TreeMultiSet. I tested the following code with Integer and it seems to do the right thing.

    import com.google.common.collect.TreeMultiset;
    
    public class TreeMultiSetTest { 
      public static void main(String[] args) {
        TreeMultiset<Integer> ts = TreeMultiset.create();
        ts.add(1);  ts.add(0); ts.add(2);
        ts.add(-1); ts.add(5); ts.add(2);
    
        for (Integer i : ts) {
          System.out.println(i);
        } 
      } 
    }
    

The below addresses the uniqueness/filtering problem you were having when using a SortedSet. I see that you also want an iterator, so this won't work.

If what you really want is an ordered list-like thing, you can make use of a PriorityQueue.

Comparator<T> cmp = new MyComparator<T>();
PriorityQueue<T> pq = new PriorityQueue<T>(cmp);
pq.add(someT);

Take note of what the API documentation says about the time properties of various operations:

Implementation note: this implementation provides O(log(n)) time for the enqueing and dequeing methods (offer, poll, remove() and add); linear time for the remove(Object) and contains(Object) methods; and constant time for the retrieval methods (peek, element, and size).

You should also be aware that the iterators produced by PriorityQueue do not behave as one might expect:

The Iterator provided in method iterator() is not guaranteed to traverse the elements of the priority queue in any particular order. If you need ordered traversal, consider using Arrays.sort(pq.toArray()).

I just noticed that Guava provides a MinMaxPriorityQueue. This implementation is array-backed, rather than the linked form provided in the JDK's PriorityQueue, and thus likely has different timing behavior. If you're doing something performance sensitive, you may wish to take a look. While the notes give slightly different (linear and logarithmic) big-O times, all those times should also be bounded, which may be useful.

There is not a List implementation per se that maintains ordering, but what you are likely looking for are implementations of SortedSet. A TreeSet is the most common. The other implementation, a ConcurrentSkipListSet is for more specific uses. Note that a SortedSet provides ordering, but does not allow duplicate entries, as does a List.

Refs:

Community
  • 1
  • 1
andersoj
  • 22,406
  • 7
  • 62
  • 73
  • Though, this does not solve my problem. It explains the situation and the possible choices well. – Op De Cirkel May 20 '12 at 22:42
  • According to the [Guava documentation](http://guava-libraries.googlecode.com/svn/tags/release02/javadoc/com/google/common/collect/TreeMultiset.html), the `TreeMultiList` isn't guaranteed to work for the use case where `compareTo()` isn't consistent to equals, as in the OP's post. The documentation says, "Warning: The comparison must be consistent with equals as explained by the `Comparable` class specification. Otherwise, the resulting multiset will violate the `Collection` contract, which it is specified in terms of `Object.equals(java.lang.Object)`." – Kevin Dec 31 '15 at 01:34
-1

I have a similar problem and I'm thinking of using a TreeSet. To avoid excluding "equal" elements I will modify the comparator so instead of returning 0 it will return a random number between (-1,1) or it will return always 1.

If you have no control over the Comparator or if you are using it for something else different than inserting this solution won't work for you.

Joan
  • 239
  • 4
  • 6