1

I need a data structure in Java that can manipulate Strings, compute the frequency of each word in an ArrayList<String> and then I need to sort them based on the frequencies.

To put it simply, the data structure needs to be an associative array, that can be sorted BY VALUES , I already put the lines into the HashMap and got surprised by the fact that it can't be sorted , now I'm stuck thinking about another data structure.

P.S. (using two lists is not suited to my program because it needs to do a lot of calculations, so it would be better if a single structure holds each String and its occurence instead of a list for the Strings and annother for the frequency).

EDIT: I appreciate the help, but some people are suggesting a TreeMap, so I want to specify something here: I need the structure sorted by the occurences of the strings (in the case of Maps it would be the values not the keys).

Gherbi Hicham
  • 2,416
  • 4
  • 26
  • 41
  • 1
    Use TreeMap instead of HashMap – Arjit May 27 '15 at 08:45
  • 2
    well, imho OP is giving the question quite clear: He want an associative array with String as key and frequency as value, for which it will be sorted base on the frequency. Quite interesting to me actually – Adrian Shum May 27 '15 at 08:47
  • @hemena314 My code uses a hash map which can't be sorted , that's why i didn't show it,i think i made it very clear what i'm trying to do (counting the occurence of strings and then sort based on the occurences ) – Gherbi Hicham May 27 '15 at 08:51
  • @hamena i only need guidance for a data structure that can be sorted ...also i'm not someone who just wants his work done for him here , i don't want programming guidance because my question is conceptual ...but thanks anyway – Gherbi Hicham May 27 '15 at 08:52
  • who said hashmap can't be sorted? – nafas May 27 '15 at 09:04
  • Possible duplicate of [Sort a Map by values (Java)](http://stackoverflow.com/questions/109383/sort-a-mapkey-value-by-values-java) – shmosel Jun 23 '16 at 00:33

5 Answers5

4

HashMap isn't sorted, actually and shouldn't be so. If you want to have entries sorted, you may use one of SortedMap implementations, for example, TreeMap.

TreeMap has a constructor, which helps you in case you have non-standard Comparator (for example, if you want natural sorting for Strings):

TreeMap(Comparator<? super K> comparator)

UPD: I missed the point, that you need to sort entries by value.

In this case, I don't see any solution, except for the one, in which you'll have to sort entries only several times, but not to keep this state.

You may use any Map, for example, stay with HashMap, but then, before processing, you may sort the entries:

Set<Map.Entry<String, Integer>> entries = map.entrySet();
Set<Map.Entry<String, Integer>> sorted = new TreeSet<>(
        Comparator.comparingInt(Map.Entry::getValue).reversed()); // it's Java 8, but you may extract this lambda
sorted.addAll(entries);
for (Map.Entry<String, Integer> entry: sorted) {
    //...
    // the entries will be sorted by value
}

To be precise, you can't you any kind of Map to maintain entries sorted this way, because order of keys is set only once and you can't change it, because of:

  1. This is not-conventional, Comparator/compareTo operator should give the same result over the run (that's why mutable classes are not appreciated in Maps)
  2. This is not expected to give you some appreciable result, the keys are not re-sorted generally.
Dmitry Ginzburg
  • 7,391
  • 2
  • 37
  • 48
3

I don't think there is a simple data structure for this.

The frequencies are changing when you are collecting the frequency data. For which sorting should comes after the all string frequencies are collected.

The easiest way I can think of is:

// psuedo-code
final Map<String, Integer> stringFreq = ....; // it doesn't matter what kind of impl you use

// collect the String vs frequency in stringFreq

Map<String, Integer> result = new TreeMap<String, Integer>(stringFreq, 
        new Comparator<String> {
        @Override
            public int compare(String a, String b) {
                int aFreq = stringFreq.get(a);
                int bFreq = stringFreq.get(b);
                return (aFreq==bFreq)?a.compareTo(b) : (aFreq-bFreq);
            }
        });


// result should have data sorted by frequency, and then the string value
Adrian Shum
  • 38,812
  • 10
  • 83
  • 131
1

Java has a SortedMap interface with two implementations. The easiest one being TreeMap

Sharon Ben Asher
  • 13,849
  • 5
  • 33
  • 47
  • 1
    Please read again what OP is asking for: TreeMap is going to sort base on the key, but OP is asking for something that will sort base on the value (frequency) – Adrian Shum May 27 '15 at 08:48
  • That is not specified in the question. The frequency could be the key? – Sharon Ben Asher May 27 '15 at 08:50
  • 2
    it is mentioned in the question. Please read it carefully :) `compute the frequency of each word in an ArrayList(of Strings) and then i need to sort it based on the frequencies`. Using frequency as key is not going to work too (reason is obvious) – Adrian Shum May 27 '15 at 08:52
  • sharonbon , thanks for the help , but i'm afraid Adrian Shum understands what i'm saying here , i need something equivalent to HashMapthat can be sorted by values . – Gherbi Hicham May 27 '15 at 09:16
  • now, that you edited your question, its much clearer. don't be surprised why it got downvoted in its previous form – Sharon Ben Asher May 27 '15 at 09:22
1

Another solution, using a custom bean and a simple list.

1/ Define your custom bean

public class StringOccurence {
  String string ;
  int occurrence ;
}

2/ Create a comparator

public class StringOccurrenceComparator implements Comparator<StringOccurence> {
  @Override
  public int compare(StringOccurrence so1, StringOccurrence so2) {
    return Integer.compare(so1.occurrence, so2.occurrence);
  }
}

3/ Sort you list using the comparator

List<StringOccurrence> list = constructList();
Collections.sort(list, new StringOccurrenceComparator());

If you are luckily enough to use java8, here is a short version of the point 2 and 3:

List<StringOccurrence> list = constructList();
Collections.sort(list, (so1, so2) -> Integer.compare(so1.occurrence, so2.occurrence));
NiziL
  • 5,068
  • 23
  • 33
1

How about if you used a maxheap data structure to store the string and its frequency occurrence value and always kept the max value frequency at the top,then you can simply get the one with the maximum frequency at one go,but the complexity here would be to re-calculate and adjust the max heap so really depends on what kind of change you expect to see more-more number of words or highly changing frequency of words.

Laks
  • 23
  • 6