5

Lest's say I have string:

 String test= "AA BB CC BB BB CC BB";

What I would like to do is create String array like this:

 String[]{"BB", "CC", "AA"}

Since B occurred 4 times C did 2 times and A only 1 time.

What would solution for this problem look like?

Community
  • 1
  • 1
MatBanik
  • 26,356
  • 39
  • 116
  • 178
  • Possible duplicate: http://stackoverflow.com/questions/6712587/counting-frequency-of-characters-in-a-string – rajah9 Jul 29 '11 at 18:35
  • 2
    not exactly duplicate. This is the 1st part only. – Bozho Jul 29 '11 at 18:37
  • do you mean `String[]{"B", "C", "A"}` ? – Eng.Fouad Jul 29 '11 at 18:38
  • 2
    possible duplicate of [Simplest way to iterate through a Multiset in the order of element frequency?](http://stackoverflow.com/questions/4345633/simplest-way-to-iterate-through-a-multiset-in-the-order-of-element-frequency) – Bozho Jul 29 '11 at 18:40
  • 1
    "Is there utility method that does something like this?" you mean in the core API? No, there is not. – Bart Kiers Jul 29 '11 at 19:00
  • "Since B occurred 4 times C did 2 times and A only 1 time." B or BB? – James P. Jul 29 '11 at 19:17
  • Ok. This sounds like something that would be applied to text. Having different weights for given words could be used as a measure of relevancy. Add combinations of words and you'd have a basic search engine. – James P. Jul 29 '11 at 20:30

4 Answers4

9
String test = "AA BB CC BB BB CC BB";
System.out.println(Arrays.deepToString(sort(test)));

Output: [BB, CC, AA]

Code:

public static String[] sort(String test) {
    String[] strings = test.split(" ");
    HashMap<String,Integer> map = new HashMap<String,Integer>();
    for (String s : strings) {
        Integer i = map.get(s);
        if (i != null) {
            map.put(s, i+1);
        } else {
            map.put(s, 1);
        }
    }

    TreeMap<Integer,String> sort = new TreeMap<Integer,String>(Collections.reverseOrder());
    for (Entry<String,Integer> e : map.entrySet()) {
        sort.put(e.getValue(), e.getKey());
    }

    return sort.values().toArray(new String[0]);
}
user802421
  • 7,465
  • 5
  • 40
  • 63
3

What you could do is something like this (rough code):

String[] myOccurences = test.split(" ");

Then:

HashMap<String,Integer> occurencesMap = new HashMap<String,Integer>()

for( String s : myOccurences ){
    if( occurencesMap.get( s ) == null ){
        occurencesMap.put(s, 1);
    } else {
        occurencesMap.put(s, occurencesMap.get(s)++ );
    }
}

Edit: The actual sorting (again rough code and unchecked):

List<String> mapKeys = new ArrayList<String>(occurencesMap.keySet()); // Keys
List<Integer> mapValues = new ArrayList<Integer>(occurencesMap.values()); // Values

TreeSet<Integer> sortedSet = new TreeSet( mapValues ); // Sorted according to natural order                
Integer[] sortedValuesArray = sortedSet.toArray();

HashMap<String,Integer> lhMap = new LinkedHashMap<String,Integer>(); // LinkedHashMaps conserve order

for (int i=0; i<size; i++){
    lhMap.put(mapKeys.get(mapValues.indexOf(sortedArray[i])), sortedValuesArray[i]);
}

mapKeys = new ArrayList<String>(occurencesMap.keySet()); // Keys again, this time sorted
Collections.sort(mapKeys, Collections.reverseOrder()); // Reverse since original ascending

String[] occurencesSortedByDescendingArray = mapKeys.toArray();

Feel free to comment.

James P.
  • 19,313
  • 27
  • 97
  • 155
  • 1
    This just produces a map with the counts for each token, it doesn't then produce the sorted `String` array based on those counts. – Daniel DiPaolo Jul 29 '11 at 18:55
  • +1, I'm out of votes though. I was writing this up (in pseudocode, as I haven't done anything in Java in a few months) and checked just before submitting my answer, saw this. Great solution. – Sam DeHaan Jul 29 '11 at 18:55
  • It's missing the sort. Just noticed while rereading the question. Thanks for pointing this out. – James P. Jul 29 '11 at 18:56
  • Sorting code added. First time I've actually typed something out without having a compiler open :p – James P. Jul 29 '11 at 19:30
3

If you want to use Guava:

Lists.transform(
  Ordering
    .natural()
    .onResultOf(new Function<Multiset.Entry<String>, Integer>() {
        public Integer apply(Multiset.Entry<String> entry) {
          return entry.getCount();
        }
      })
    .reverse()
    .sortedCopy(
        ImmutableMultiset.copyOf( Splitter.onPattern("\\s+").split(test) ).entrySet()
      ),
  new Function<Multiset.Entry<String>, String>() {
    public String apply(Multiset.Entry<String> entry) {
      return entry.getElement();
    }
  }
);
Adam Paynter
  • 46,244
  • 33
  • 149
  • 164
1

I am not sure if a method exists for this exact purpose.

However, you could use the String.split() method to split the single string into an array of strings. From there, you could locate unique strings (either by manually checking or adding them all to a set, which would check for duplicates). Track (and increment a counter unique to each unique String) each time you add an element and it is not part of the collection. Then create an array that is sorted based on this count.

A map would be ideal for holding the String/count, as it would maintain the set of unique Strings as keys, and the count for each String as the value.

ty1824
  • 1,210
  • 10
  • 11
  • An answer with actual code was already posted that does exactly this and then the actual non-trivial step of taking that map and converting it to the sorted string array – Daniel DiPaolo Jul 29 '11 at 19:03
  • And it was posted while I was writing! I'm typically wary of writing code-only answers to questions, as I am a student and I know how many of my fellow students use this site. – ty1824 Jul 29 '11 at 19:08
  • ty1824's answer hadn't been posted when I started to write something. Also, I missed the part about sorting which was actually the most important part and have updated. P.S: I'm kind of a student too although probably older (finishing a degree at night school). Kudos for taking the jump of attempting to answer a question like this. – James P. Jul 29 '11 at 19:44