2

I have the following structures: list of lists of MyObject, and comparator that compares 2 MyObjects

The task is to get up to N lesser objects from the list.

The problem may be solved by several ways:

  1. To put all elements in one new list and to sort it using Collections.sort() or Arrays.parallelSort()

  2. To put all elements into ProirityQueue and then retrive N top elements

  3. To put all elements in SortedSet (TreeSet) and retrieve needed elements using iterator

I don't know wich way to choice. The creteria is performance. The size of "internal list" is about 20 elements and "outer" list size is about 10

bw_dev
  • 775
  • 1
  • 7
  • 17
  • 1
    so, did you do anything so far other than writing this question? Have you taken a look at time complexity of any operation on the data structure of each possible solution? – luk2302 Feb 12 '18 at 15:13
  • What does "N lesser objects" mean? Do you need N objects, that are smaller in comparison to one specific objetc, or do you need the N smallest objects of the whole construct? – Christoph Lembeck Feb 12 '18 at 15:13
  • 1
    With lists of sizes `20` and `10`, it's irrelevant performance-wise, but you can't for example use `SortedSet` if your elements aren't unique. – Kayaman Feb 12 '18 at 15:16
  • Many task have some vague "performance" criteria, but with just 200 elements it is very likely that it won't make any measurable difference no matter which approach you chose (Unless that code is a critical part that gets called thousand times a seconds or so) – kapex Feb 12 '18 at 15:17
  • 1
    Even if "the criteria is performance", it matters how often the list is changed vs. how often the reduced list is accessed. It's doubtful you'll get around profiling your options. – daniu Feb 12 '18 at 15:18
  • 1
    If you only need a partial sort, that is only `N` elements. Then the best would probably be the `PriorityQueue`, it was built exactly for that purpose. Put everything in it and then call `N` times the `pullMin` method (or how it was called). If you take a closer look at how things are realized internally, you will immediately see why `PriorityQueue` is a good choice for this task. But, as others said, for a list of only `20` elements things may be different. For bigger lists like `1mio` you will notice a heavy performance difference. – Zabuzard Feb 12 '18 at 15:30
  • The typical N is 10. The data not changed often, but the method is called very often (up to 100 times in second). Actually, I wrote this question because I created performance tests, and the best result is with Arrays.parallelSort(), while the analysis says that the best solution is PriorityQueue – bw_dev Feb 12 '18 at 15:52

1 Answers1

2

If you only want the lowest N elements out of about 200, and N is small (say 10-20) then why would you bother adding all elements to another structure (be it a list or PriorityQueue)? Just create an array (or a List if you must) of N items. Go through every element in the list of lists, and insert the element into the correct rank in the array of N. Discard the Nth element which gets dropped off the bottom of the list. This is like an Insertion Sort algorithm but only doing a partial sort.

If N is large (say 150 out of your total of 200 items) then a full sort or PriorityQueue might be better. You'll need to measure.

DodgyCodeException
  • 5,963
  • 3
  • 21
  • 42
  • Thanks! I will implement and test it right now. – bw_dev Feb 12 '18 at 15:56
  • 1
    I tested the solution that you proposed and it shows excellent result. The number of comparison operation dramatically reduced (I see this in the log). The results are: using parallel sort - about 18 ms on 10000 iteration, using Priority queue: about 60 ms, using "partial sort": about 7 ms on 10000. Thanks! – bw_dev Feb 13 '18 at 09:12