32

Consider this example which prints out some device type stats. ("DeviceType" is an enum with a dozenish values.)

Multiset<DeviceType> histogram = getDeviceStats();
for (DeviceType type : histogram.elementSet()) {
    System.out.println(type + ": " + histogram.count(type));
}

What's the simplest, most elegant way to print the distinct elements in the order of their frequency (most common type first)?

With a quick look at the Multiset interface, there's no ready-made method for this, and none of Guava's Multiset implementations (HashMultiset, TreeMultiset, etc) seem to automatically keep elements frequency-ordered either.

sebkur
  • 658
  • 2
  • 9
  • 18
Jonik
  • 80,077
  • 70
  • 264
  • 372

4 Answers4

41

I just added this feature to Guava, see here for the Javadoc.

Edit: usage example of Multisets.copyHighestCountFirst() as per the original question:

Multiset<DeviceType> histogram = getDeviceStats();
for (DeviceType type : Multisets.copyHighestCountFirst(histogram).elementSet()) {
    System.out.println(type + ": " + histogram.count(type));
}
sebkur
  • 658
  • 2
  • 9
  • 18
Louis Wasserman
  • 191,574
  • 25
  • 345
  • 413
  • Wow, thanks! So this will be included in Guava release 11 apparently? – Jonik Oct 09 '11 at 19:15
  • 1
    Yep. (I was the Guava intern this summer.) It might get renamed, though; see http://code.google.com/p/guava-libraries/issues/detail?id=356. – Louis Wasserman Oct 17 '11 at 06:23
  • I took the liberty to add a code example. Also marking this as accepted now. Thanks again for implementing the feature! – Jonik May 21 '12 at 19:20
  • 6
    Well, pretty nice, but it would be better with ability to choose ascending/descending order of sorting. – kolobok Sep 22 '12 at 11:49
  • I think I agree with @akapelko, I would have liked to see some more general purpose sort thing here. On the other hand, its nice to have this when I google for "sort multiset guava" in an attempt to get my multiset histogram into descending order. I am definitely pro-Guava. – Bob Kuhar Oct 22 '12 at 18:09
  • I cannot find this feature in Guava 18 anymore ([GitHub issue #356](https://github.com/google/guava/issues/356)). – Sonson123 Mar 18 '15 at 07:29
  • 1
    @Sonson123: I just tested with Guava 18.0 (the latest version), and [`copyHighestCountFirst()`](http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/collect/Multisets.html#copyHighestCountFirst(com.google.common.collect.Multiset)) works just fine. – Jonik May 19 '15 at 22:49
  • @Jonik, you are right - sorry for the confusion. (I may have used an older Guava version). – Sonson123 May 20 '15 at 06:57
7

Here's a method that returns a List of entries, sorted by frequency (UPDATE: used a flag to toggle ascending / descending order and used Guava's favorite toy: the Enum Singleton Pattern, as found in Effective Java, Item 3 ):

private enum EntryComp implements Comparator<Multiset.Entry<?>>{
    DESCENDING{
        @Override
        public int compare(final Entry<?> a, final Entry<?> b){
            return Ints.compare(b.getCount(), a.getCount());
        }
    },
    ASCENDING{
        @Override
        public int compare(final Entry<?> a, final Entry<?> b){
            return Ints.compare(a.getCount(), b.getCount());
        }
    },
}

public static <E> List<Entry<E>> getEntriesSortedByFrequency(
    final Multiset<E> ms, final boolean ascending){
    final List<Entry<E>> entryList = Lists.newArrayList(ms.entrySet());
    Collections.sort(entryList, ascending
        ? EntryComp.ASCENDING
        : EntryComp.DESCENDING);
    return entryList;
}

Test code:

final Multiset<String> ms =
    HashMultiset.create(Arrays.asList(
        "One",
        "Two", "Two",
        "Three", "Three", "Three",
        "Four", "Four", "Four", "Four"
    ));

System.out.println("ascending:");
for(final Entry<String> entry : getEntriesSortedByFrequency(ms, true)){
    System.out.println(MessageFormat.format("{0} ({1})",
        entry.getElement(), entry.getCount()));
}

System.out.println("descending:");
for(final Entry<String> entry : getEntriesSortedByFrequency(ms, false)){
    System.out.println(MessageFormat.format("{0} ({1})",
        entry.getElement(), entry.getCount()));
}

Output:

ascending:
One (1)
Two (2)
Three (3)
Four (4)
descending:
Four (4)
Three (3)
Two (2)
One (1)

Sean Patrick Floyd
  • 292,901
  • 67
  • 465
  • 588
  • This is pretty nice, quite possibly the simplest way for now while Multiset itself doesn't provide it! – Jonik Dec 04 '10 at 11:21
  • 2
    (Guava is so cool; I hadn't noticed e.g. the `Ints` class before. Looking at the API docs some more, `Files` was also new to me — a util collection similar to what Commons IO has, except with generics support & somehow *cleaner* overall.) – Jonik Dec 04 '10 at 11:23
3

An Implementation using ForwardingMultiSet :

(EntryComp from seanizer's answer)

enum EntryComp implements Comparator<Multiset.Entry<?>> {
    DESCENDING {
        @Override
        public int compare(final Entry<?> a, final Entry<?> b) {
            return Ints.compare(b.getCount(), a.getCount());
        }
    },
    ASCENDING {
        @Override
        public int compare(final Entry<?> a, final Entry<?> b) {
            return Ints.compare(a.getCount(), b.getCount());
        }
    },
}

public class FreqSortMultiSet<E> extends ForwardingMultiset<E> {
    Multiset<E> delegate;
    EntryComp comp;

    public FreqSortMultiSet(Multiset<E> delegate, boolean ascending) {
        this.delegate = delegate;
        if (ascending)
            this.comp = EntryComp.ASCENDING;
        else
            this.comp = EntryComp.DESCENDING;
    }

    @Override
    protected Multiset<E> delegate() {
        return delegate;
    }

    @Override
    public Set<Entry<E>> entrySet() {
        TreeSet<Entry<E>> sortedEntrySet = new TreeSet<Entry<E>>(comp);
        sortedEntrySet.addAll(delegate.entrySet());
        return sortedEntrySet;
    }

    @Override
    public Set<E> elementSet() {
        Set<E> sortedEntrySet = new LinkedHashSet<E>();
        for (Entry<E> en : entrySet())
            sortedEntrySet.add(en.getElement());
        return sortedEntrySet;
    }

    public static <E> FreqSortMultiSet<E> create(boolean ascending) {
        return new FreqSortMultiSet<E>(HashMultiset.<E> create(), ascending);
    }

    /*
     * For Testing
     * public static void main(String[] args) {
        Multiset<String> s = FreqSortMultiSet.create(false);
        s.add("Hello");
        s.add("Hello");
        s.setCount("World", 3);
        s.setCount("Bye", 5);
        System.out.println(s.entrySet());
    }*/

}
sebkur
  • 658
  • 2
  • 9
  • 18
Emil
  • 13,577
  • 18
  • 69
  • 108
  • +1. Elegant, in a way, to have a Multiset implementation which includes and abstracts away all the sorting logic. Client code using this would remain simple (no changes needed to example code in my question as long as FreqSortMultiSet was used). Drawback, of course, is having to write & maintain somewhat more code overall than in [S.P.Floyd's solution](http://stackoverflow.com/questions/4345633/simplest-way-to-iterate-through-a-multiset-in-the-order-of-element-frequency/4346245#4346245)... – Jonik Dec 07 '10 at 19:01
2

Since it is not yet implemented, I guess you can create a Map with key=type and value=count. Then sort that map - see here

Community
  • 1
  • 1
Bozho
  • 588,226
  • 146
  • 1,060
  • 1,140