Two-criteria priority queue

Question

Is there a not too complicated way how to implement a priority queue working with two criteria? The queue gets created with 2 Comparators and provides (besides add) the operations poll1() and poll2(), where each removes and returns the smallest element according to the corresponding comparator.

Note that it has nothing in common with these two questions.

Motivation

My use case is Branch and Bound Optimization. Expanding the candidate with the best bound is provably optimal when you're given an unlimited time. Assuming an unlimited time is provably wrong.

Strictly following this strategy often ends up with having no solution at all when the deadline comes. A simple band-aid is to direct the search towards a solution first and then switch to the best bound strategy. This is rather unsatisfactory as the first solution found can be of an arbitrary low quality.

That's why I'd like to use the two criteria queue: In one step, expand the best bound candidate and in another, expand the "best looking" candidate according to some heuristics.

Another possible use would be for pareto-optimization.

Agreed. The mechanics of a `PriorityQueue` rely on being able to compare to items. — Micah Smith, Aug 20 '14 at 21:55
@SotiriosDelimanolis How would you remove the element obtained by `poll1()` from the second queue? — maaartinus, Aug 20 '14 at 22:12
Store it temporarily, call `remove(Object)` (passing it in) on the second `PriorityQueue`. — Sotirios Delimanolis, Aug 20 '14 at 22:14
I'm not sure exactly, but should be less than `O(n)`. The removal isn't like `poll`, from the top. It can remove anywhere and still has to re-organize the underlying structure (probably a balanced heap/tree in most implementations). — Sotirios Delimanolis, Aug 20 '14 at 22:22
@SotiriosDelimanolis `java.util.PriorityQueue` uses a heap rather than a tree as it's faster. With two `SortedSet`s it could work. — maaartinus, Aug 20 '14 at 22:26
Two priority queues, a facade and a wrapper with a value indicating wheter item has been already consumed (if it has, then call poll again...) — Sami Korhonen, Aug 31 '14 at 01:26
@SamiKorhonen This sounds good, but it can keep too many elements in the queue too long. Maybe some periodic cleanup would solve it. — maaartinus, Aug 31 '14 at 01:44
You could set reference from wrapper to actual object as null once you mark the object as used. I would not expect you to encounter any problems, even if you had few thousand invalid references in the queue — Sami Korhonen, Aug 31 '14 at 01:54
I guess you could have a counter of invalid references per queue. And if counter goes above specified threshold, you would just use iterator to remove them. You could use a ReadWrite lock to sync cleanups: use read lock for polling and write lock for cleanup. Atomics should work well as counters — Sami Korhonen, Aug 31 '14 at 02:05
@SamiKorhonen This nulling can't work in general, as the actual object is needed for sorting. You'd need to store the value used for sorting in the wrapper and I'm not sure if it's smaller than my actual object. ++ The counter idea sounds good. — maaartinus, Aug 31 '14 at 02:06
Tru, you would need to cache weight in the wrapping element. Without knowing the actual data and use case, it's hard to say what would be an efficient implementation — Sami Korhonen, Aug 31 '14 at 13:00
Sounds somewhat interesting, but I'd be curious to see how this is intended to be used (particularly, a convincing example where this makes more sense than two separate priority queues) — Marco13, Aug 31 '14 at 14:59

jxh · Answer 1 · 2014-08-31T21:24:08.377

3

Use two TreeSets. Each set takes their respected comparator, and the add() method adds the provided value to each set. poll1() would acquire the first() value in the first set, and remove the value from both sets. Similarly for poll2().

add(), poll1() and poll2() will have Ο(log n).

An example implementation follows (the real thing probably needs more error checking and additional methods):

class Priority2Queue <E>
{
    private TreeSet<E> q1;
    private TreeSet<E> q2;

    public Priority2Queue (Comparator<E> c1, Comparator<E> c2) {
        q1 = new TreeSet<E>(c1);
        q2 = new TreeSet<E>(c2);
    }

    public void add (E e) {
        q1.add(e);
        q2.add(e);
    }

    private E pollx (TreeSet<E> a, TreeSet<E> b) {
        E top = a.first();
        a.remove(top);
        b.remove(top);
        return top;
    }

    public E poll1 () { return pollx(q1, q2); }
    public E poll2 () { return pollx(q2, q1); }
}

edited Aug 31 '14 at 21:24

answered Aug 31 '14 at 15:11

jxh

69,070
8
110
193

+1 This is simple and sure to work, but pretty wasteful as a `TreeSet` takes much more memory and probably also time than a heap. – maaartinus Sep 03 '14 at 19:23
@maaartinus: I picked a data structure that would provide the correct big-O behavior. If your queue sizes are relatively small, two priority queues are probably sufficient. If you expect your queues to be very large, consider using a B+Tree implementation instead of `TreeSet`. – jxh Sep 03 '14 at 19:44
Surprisingly, your solution is way faster than my attempt to kajacx's answer. See my comments below his answer for links, if interested. – maaartinus Sep 04 '14 at 01:55
I see now, I've implemented something else... using two `PriorityQueue`s and keeping track of removals. I'll try your solution, too, one day. – maaartinus Sep 04 '14 at 03:41

djechlin · Answer 2 · 2014-08-31T12:49:12.737

2

Use two priority queues.

One way to do this is tag an item for deletion when it's next in one's poll. If the other queue sees it, skip and move to next. I'm not sure the complexity of this, but it will depend on subtle questions such as how much longer one item remains in a queue than the other. I would suggest profiling for your actual situation.

edited Aug 31 '14 at 12:49

answered Aug 31 '14 at 12:36

djechlin

59,258
35
162
290

+1 This is good, except for one thing, namely possible accumulation of already consumed things in one of the queues (as you noted). It may work well and profiling is no bad idea, but it can't tell me what happens for other inputs. – maaartinus Sep 03 '14 at 19:25

kajacx · Answer 3 · 2014-10-19T12:49:14.430

2

If you care only about asymptotic complexity, i would go with @jxh's solution, which uses O(n) memory and guaranteed time O(log(n)) per operation (that is the best possible) and requires very little coding.

However TreeSet is backed by TreeMap and that uses TreeEntry for entries, and one entry will take total 32B of memory, plus having 2 trees will result into 64B of memory per element!

EDIT 2: The old answer was incorrect, I have moved it here if anyone wants to see it anyway. Follows (hopefully) correct answer.

The idea is really simple. Let's have 2 ordinary heaps, one sorted via the first comparator, the other one sorted via the second comparator. And with that 2 indexing arrays that will link the elements between the heaps.

For example if element X is in position 2 in heap 1 and in position 8 in heap 2, then the linking arrays will look like this:

linkFrom1To2[2] = 8;
linkFrom2To1[8] = 2;

So if you perform any element moving in either of the heaps, you have to update these arrays accordingly.

But now if you remove minimum from one heap, you know where that element was in the second heap, so you can remove it from there as well.

I have posted the full code here, this solution requires only 16B per element and was even faster that the @jxh's solution in the tests I've done, but I used simple arrays with maximal capacity instead of some auto-growing structures like ArrayList.

edited Oct 19 '14 at 12:49

answered Sep 01 '14 at 09:42

kajacx

12,361
5
43
70

Could you post your code? AFAIK the idea is similar to [MinMaxPriorityQueue](http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/collect/MinMaxPriorityQueue.html), so I wonder why it doesn't work. – maaartinus Sep 02 '14 at 17:04
Sure, look at at the code [here](http://pastebin.com/HC3Y3qdj), but it's not much cleaned-up, however you should see first inspection fail at epoch 19. The `add()` method doesn't work, because I forgot that when element changes floors, it's comparator towards lower elements also might change. – kajacx Sep 02 '14 at 18:24
+1 purely for having the guts to put together your own code using well-known concepts. – djechlin Sep 03 '14 at 19:59
I guess the problem is the order of steps 1 and 2 in `add`. By swapping with the grandparent you may introduce an inconsistency with the parent which never gets fixed. I [tried the other way round](https://dl.dropboxusercontent.com/u/4971686/published/maaartin/so/TwofoldPriorityQueue.java) and so far the [test](https://dl.dropboxusercontent.com/u/4971686/published/maaartin/so/TwofoldPriorityQueueTest.java) seems to pass. ([interface](https://dl.dropboxusercontent.com/u/4971686/published/maaartin/so/ITwofoldPriorityQueue.java)) – maaartinus Sep 03 '14 at 22:04
I was wrong, swapping the order doesn't suffice, the `selftest` was badly broken. I'm afraid that after a swap, more things have to be check and possibly fixed. – maaartinus Sep 03 '14 at 23:22
I've got it working, but have to look at grandparent, parent, both children, and all grandchildren in each step. Currently, it's [dead slow](https://microbenchmarks.appspot.com/runs/ff079c65-4763-46db-9ff2-be68c192bb0f) as my [benchmark](https://dl.dropboxusercontent.com/u/4971686/published/maaartin/so/TwofoldPriorityQueueBenchmark.java) shows. [jxh's solution](https://dl.dropboxusercontent.com/u/4971686/published/maaartin/so/SimpleTwofoldPriorityQueue.java) is way faster, but I like your idea more. – maaartinus Sep 04 '14 at 01:53

Strikeskids · Accepted Answer · 2014-08-31T22:44:57.027

1

The object of this task would hopefully be to maintain the same amoritized runtimes for the goal priority queue.

These runtimes are O(log n) for insertion and removal.

Maintain two priority queues each with their own comparator. Maintain a map of the counts of the objects added. When an object is removed from the queue, decrement the count. If the object you attempt to poll from the queue has a count <= 0, then poll another item.

The following implementation maintains the runtimes. It has amoritized O(log n) for insertion and removal. Because each element is added to each queue and removed from each queue only once, it must be a constant multiple of the runtime of a single queue. In addition, the map has insertion and removal times of O(1), which does not contribute to the O(log n) runtime for the queues.

public class DualPriorityQueue<T> {

    private Queue<T> queue1, queue2;
    private Map<T, Integer> counts;

    public DualPriorityQueue(int size, Comparator<T> c1, Comparator<T> c2) {
        queue1 = new PriorityQueue(size, c1);
        queue2 = new PriorityQueue(size, c2);
        counts = new HashMap<T, Integer>(size);
    }

    public T poll1() {
        return poll(queue1);
    }

    public T poll2() {
        return poll(queue2);
    }

    private T poll(Queue<T> queue) {
        T t = null;
        while (!queue.isEmpty() && !removeCount(t = queue.poll())) {
            t = null;
        }
        return t;
    }

    public boolean offer(T t) {
        queue1.offer(t);
        queue2.offer(t);
        addCount(t);
        return true;
    }

    private boolean removeCount(T t) {
        final Integer value = counts.get(t);
        if (value != null && value > 0) {
            final int newValue = value - 1;
            if (newValue == 0) {
                counts.remove(t);
            } else {
                counts.put(t, newValue);
            }
            return true;
        }
        return false;
    }

    private void addCount(T t) {
        final Integer prev = counts.get(t);
        if (prev == null || prev <= 0) {
            counts.put(t, 1);
        } else {
            counts.put(t, prev + 1);
        }
    }
}

edited Aug 31 '14 at 22:44

answered Aug 31 '14 at 14:17

Strikeskids

3,932
13
27

The counting is needless complicated, you only need a `Set`. I actually implemented your solution (not exactly and not completely, a lot of tests are missing). See the links below another answer if interested. The remaining problem is the possible accumulation of invalid entries. – maaartinus Sep 04 '14 at 23:13
As I mentioned earlier, if your queues are relatively small, using two priority queues is fine (I am not sure I understand the need for the marking). The reason you want to keep your queues small with `PriorityQueue` is that removal is O(n) not O(log n). This implementation of `poll()` has O(n log n) worst case, which makes its average case performance difficult to analyze. – jxh Sep 04 '14 at 23:47
Both of you are incorrect. A set would not account for multiple instances of the aame object properly. Because a priority queue can have two of the same object, that is necessary. Additionally, the amoritized performance is not difficult to calculate. Although the worst case of one poll could be nlogn, the average case performance over the entire time the structure is in use is log n. Because each element is only added and remove once from each priority queue, the overall performance is log n. – Strikeskids Sep 05 '14 at 09:56
@Strikeskids: Perhaps amortized over the whole data structure usage, you may be right about the average performance. However, if the usage is mostly tilted towards one queue, then it can happen that usage of the other queue can always generate O(n log n) behavior. More worrisome is that a tilted use case can cause the rarely used queue to use unbounded memory. – jxh Sep 05 '14 at 23:03
@jxh the whole point of the algorithmic runtime estimation is to estimate over the entire queue. If you are so worried about the size of the other queue building up, then you can purge the elements that don't exist in the offending queue when it reaches double the total size of the double-queue. Even if one execution of the poll method takes significantly longer than the others, the overall execution time of the poll will be log n – Strikeskids Sep 06 '14 at 01:31
@Strikeskids: The purge operation will also be O(n log n). I am just noting that this solution is not guaranteed to provide a "uniform" latency experience for all calls to `poll`. Perhaps you don't care about predictable latency, but there may be those that do. – jxh Sep 06 '14 at 01:47
@jxh evidently if you desire predictable latency, then this is not the implementation for you. Usually, the desire is to have the overall algorithm run efficiently, and this will provide the functionality in that use case. – Strikeskids Sep 06 '14 at 01:52
@Strikeskids: I grant for the stated application (branch-bound optimization), this solution is likely sufficient, assuming the decision making is not tied to a real-time interface. – jxh Sep 06 '14 at 02:22

Zixradoom · Answer 5 · 2014-08-31T13:24:25.333

You could do something akin to this.

public final class MyPriorityQueue < T >
{
  private static final int INITIAL_CAPACITY = 10;

  private final Queue < T > q1;
  private final Queue < T > q2;
  private final List < Queue < T > > queues;

  public MyPriorityQueue ( Comparator < T > comparator1,
      Comparator < T > comparator2 )
  {
    q1 = new PriorityQueue < T > ( INITIAL_CAPACITY, comparator1 );
    q2 = new PriorityQueue < T > ( INITIAL_CAPACITY, comparator2 );
    queues = Arrays.asList ( q1, q2 );
  }

  public void enqueue ( T t )
  {
    for ( Queue < T > queue : queues )
      queue.add ( t );
  }

  public T poll1 ()
  {
    T returnValue = q1.poll ();
    q2.remove ( returnValue );
    return returnValue;
  }

  public T poll2 ()
  {
    T returnValue = q2.poll ();
    q1.remove ( returnValue );
    return returnValue;
  }
}

One other modification might be this.

public final class MyPriorityQueue < T >
{
  private static final int INITIAL_CAPACITY = 10;

  private final Map < Comparator < T >, Queue < T > > queues;

  public MyPriorityQueue ( List < Comparator < T > > comparators )
  {
    queues = new HashMap < Comparator < T >, Queue < T > > ();
    for ( Comparator < T > comparator : comparators )
    {
      Queue < T > q = new PriorityQueue < T > ( INITIAL_CAPACITY, comparator );
      queues.put ( comparator, q );
    }
  }

  public void enqueue ( T t )
  {
    for ( Queue < T > queue : queues.values () )
      queue.add ( t );
  }

  public T poll ( Comparator < T > comparator )
  {
    Queue < T > q = queues.get ( comparator );
    if ( q == null )
      return null;
    T returnValue = q.poll ();
    for ( Queue < T > queue : queues.values () )
      if ( queue != q )
        queue.remove ( returnValue );
    return returnValue;
  }
}

lolz, just the style I got used too. Most of my college friends hated it. — Zixradoom, Aug 31 '14 at 12:47
@MarcoAcierno I guess because of spaces being cheap to print as they costs no ink? However, seeing so many them makes me eyes cry. The other problem is that `poll1` keeps the element in `q2`, which means I'd process it twice. — maaartinus, Aug 31 '14 at 13:14
My intra-brain-parser parser keeps complaining about "uncomparable types" when reading `Queue < T` :-( — Marco13, Aug 31 '14 at 14:57

Two-criteria priority queue

Motivation

5 Answers5