3

I have an object called Instance with 2 fields, an array of features (which is another object) that represent columns in a dataset, such as age, sex, class, etc; and their values (i.e a number). I also have implemented a custom comparator that can sort a list of these objects based on a particular feature of the instance as follows:

Comparator<Instance> comparator = Comparator.comparing(c -> c.get(feature));
Instance[] sorted = instList.stream().sorted(comparator).toArray(Instance[]::new);

Now, this code works fine, however, there are many cases in which the feature that I am sorting by has the same value as another instance. In this case, how does Java decide how to continue sorting the list?

George Francis
  • 462
  • 2
  • 7
  • 16

2 Answers2

6

Quoting from the Java API for Stream.sorted():

For ordered streams, the sort is stable. For unordered streams, no stability guarantees are made.

A stream over a List is ordered, which means a stable sort algorithm is used. A stable sort guarantees that equal elements will not be swapped. Elements that compare equal are left in the same relative order as in the starting list.

All of the sorting methods in the standard library have similar guarantees:

  • Collections.sort()

  • Arrays.sort()

  • Arrays.parallelSort()

    This sort is guaranteed to be stable: equal elements will not be reordered as a result of the sort.

    ...

    The documentation for the methods contained in this class includes briefs description of the implementations. Such descriptions should be regarded as implementation notes, rather than parts of the specification. Implementors should feel free to substitute other algorithms, so long as the specification itself is adhered to. (For example, the algorithm used by sort(Object[]) does not have to be a MergeSort, but it does have to be stable.)

  • List.sort()

    Implementation Note:
    This implementation is a stable, adaptive, iterative mergesort...

John Kugelman
  • 349,597
  • 67
  • 533
  • 578
0

sex

Don't ask for this unless you have a particular reason for it (not just as a general kindness, but also because needless asking means your application probably cannot be used for EU government business soon. Might as well get in the habit of unlearning this).

In this case, how does Java decide how to continue sorting the list?

If 2 objects are considered 'on equal footing', then what happens depends on what you're doing.

For Collections.sort, Arrays.sort, streams, and anything else where uniqueness is not a guaranteed property, the sorting order is more or less 'arbitrary' between these two.

For uniquefying concepts like a TreeSet, 'on equal footing' means equal, and no 2 items 'on equal footing' can be in the same treeset.

I have an object called Instance with 2 fields

Note that you can chain comparators: You can tell a comparator to resort to using another comparator if it cannot differentiate two objects:

Comparator<Instance> comparator = Comparator.comparingInt(Instance::getAge)
  .thenComparing(Comparator.comparing(Instance::getKind));
rzwitserloot
  • 85,357
  • 5
  • 51
  • 72