1

I have a string say "aaabbacccd" here count[a]=4 count[b]=2 count[c]=3 and count[d]=1 . I have to find character with nth largest frequency. In the above string the character with 3rd highest frequency is b (because 1<2<3<4) and count[b]=2

The straight forward solution is storing character Vs frequency in a map and sorting the values using sorting collection by value method like below

public class MapUtil {
public static <K, V extends Comparable<? super V>> Map<K, V> 
    sortByValue(Map<K, V> map) {
    List<Map.Entry<K, V>> list = new LinkedList<Map.Entry<K, V>>(map.entrySet());
    Collections.sort( list, new Comparator<Map.Entry<K, V>>() {
        public int compare(Map.Entry<K, V> o1, Map.Entry<K, V> o2) {
            return (o1.getValue()).compareTo( o2.getValue() );
        }
    });

    Map<K, V> result = new LinkedHashMap<K, V>();
    for (Map.Entry<K, V> entry : list) {
        result.put(entry.getKey(), entry.getValue());
    }
    return result;
}

}

I tried solving this problem using tree-map to maintain the characters in sorted order based on the count. But all i ended up with was violating the equality and compare to constraint thus my map ending with inconsistent values.

So can't this problem be solved using Tree-map or any other data structure in an optimal way ?

Aarish Ramesh
  • 6,745
  • 15
  • 60
  • 105
  • Look up ["order statistics in linear time"](https://stackoverflow.com/q/251781/335858). – Sergey Kalinichenko Sep 20 '17 at 19:29
  • If I understand correctly, you want k'th smallest element in the array. There are algorithms which allow you to do that without sorting the array. [This one](https://en.wikipedia.org/wiki/Median_of_medians) should work for larger arrays in linear time which is an improvement over sorting. – jrook Sep 20 '17 at 19:29
  • @dasblinkenlight This problem has to find out kth largest frequency. Don't get how it is related to order statistics which is for unsorted array of integers – Aarish Ramesh Sep 20 '17 at 19:53
  • 2
    @Aarish Right. You start by creating an unsorted array of integers representing character counts. Then you compute k-order stat, and finally you find the character that has the corresponding count. – Sergey Kalinichenko Sep 20 '17 at 19:57

11 Answers11

1

Here is a streaming solution:

import java.util.Collections;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;


public class StackOverflow {

  private static class S46330187 {
    public static void main(String[] args) {
      //prints b
      System.out.println(kthMostFrequentChar("aaabbacccd",3));
      //prints b
      System.out.println(kthMostFrequentChar("aaabbacccbbbd",1));
      //prints e
      System.out.println(kthMostFrequentChar("aaabbbcccdddeee",5));

    }

    private static Character kthMostFrequentChar(final String string, final int kth) {
      Map<Integer, Long> counts = string.chars()
          .boxed()
          .collect(Collectors.groupingBy(
              Function.identity(),
              Collectors.counting()
          ));
      return counts.entrySet()
          .stream()
          .sorted(Collections.reverseOrder(Map.Entry.comparingByValue()))
          .map(e->(char)e.getKey().intValue())
          .limit(kth)
          .reduce((l,r)->r)
          .orElseThrow(IllegalArgumentException::new);
    }

  }
}
Andreas
  • 4,937
  • 2
  • 25
  • 35
  • it will not pass competative test. Use hash map of char to frequency and then apply order statistics algorithm to find kth largest in that reverse map. – donald Jul 28 '18 at 04:52
  • It will fail in scenario where multiple characters have same frequencies. //Should print c but will print b. |System.out.println(kthMostFrequentChar("aabcb", 2));| – Rayon Jun 25 '19 at 20:10
1

Using a TreeMap won't be of much help because it is sorted on the keys, not on values.

Here's my simple algorithm:

  1. Gather the information on how many times a character appears in a given string.
  2. Store it in a map of character and frequency.
  3. Retrieve the values (frequencies) and reverse sort it.
  4. Pick the 3rd value (since you want the third highest frequency) from this reverse sorted values.
  5. Find all the keys (characters) from the map which have this frequency and print them.

Here's a working program basing this algorithm:

    String str = "aabbddddccc";
    Map<Character, Integer> map = new HashMap<>();
    while (!str.isEmpty()) {
        char ch = str.charAt(0);
        String[] arr = str.split(Character.toString(ch));
        map.put(ch, arr.length == 0 ? str.length() : arr.length - 1);
        str = String.join("", arr);         
    }
    System.out.println(map);
    List<Integer> values = new ArrayList<>(map.values().stream().sorted().collect(Collectors.toSet()));
    Collections.reverse(values);
    map.keySet().stream().filter(e -> map.get(e) == values.get(3)).forEach(System.out::println);
VHS
  • 9,534
  • 3
  • 19
  • 43
  • This won't work if there are characters with same number of counts because of this map.keySet().stream().filter(e -> map.get(e) == values.get(3)).forEach(System.out::println); – Aarish Ramesh Sep 21 '17 at 05:31
  • @Aarish. Good catch. I just fixed it to first remove the duplicate frequencies by collecting the frequencies in a set rather than a list. See my updated my answer. I think you have already found the correct the answer from others' posts. But this will still be helpful to future readers. – VHS Sep 21 '17 at 14:00
1

This comes in pretty handy during online tests as is a pretty simple solution using two maps. We first create a character-frequency Hashmap. Using this we create frequency-character treemap, inserting an element into the treemap only if has lesser value (this is to handle cases when there are multiple characters have the same frequency ). Once the treemap is made(which is already sorted in increasing order of frequency), we just print the character corresponding to the kth highest frequency.

import java.util.*;
import java.lang.*;
import java.io.*;
class Ideone
{
    public static void main (String[] args) throws java.lang.Exception{
    String s="1112229999955555777777777";
    HashMap<Character,Integer> hm= new HashMap<Character,Integer>();
    for(int i=0;i<s.length();i++){
        char cur=s.charAt(i);
        if(!hm.containsKey(cur)){
            hm.put(cur,0);
        }
        hm.put(cur,hm.get(cur)+1);
    }
    TreeMap<Integer,Character> tm= new TreeMap<Integer,Character>();

    for(Map.Entry<Character,Integer> entry:hm.entrySet()){
        int x=entry.getValue();
        char cur=entry.getKey();
        if(!tm.containsKey(x)){
            tm.put(x,cur);
        }
        else{
            if(cur<tm.get(x))
                tm.put(x,cur);
        }

    }

    int k=2; // print the kth most frequent character. If there are multiple, print the one with lesser value
    int tar=tm.size()-k;
    if(tar<0) System.out.println("-1");
    else{
    int d=0;
    for(Map.Entry<Integer,Character> entry: tm.entrySet()){

        int x=entry.getKey();
        char cur=entry.getValue();
        //System.out.println(cur+" "+x);
        if(d++==tar)  {
            System.out.println("ans- "+cur);
            break; // we have found the requried character, so break
        }
    }

    }

}
}
1
static String getNthlargestFrequency(String str, int K) {
    char[] charArr = str.toCharArray();     
    String result = "-1"; //if nth frequency doesn't exist
    Map<Character, Integer> map = new HashMap<Character, Integer>();
    for (char value : charArr) {
        if (map.containsKey(value)) {
            map.put(value, map.get(value) + 1);
        } else {
            map.put(value, 1);
        }
    }
    // sort map on value basis [desc]
    System.out.println("Original map" + map);
    Map sorted = map.entrySet().stream().sorted(Collections.reverseOrder(Map.Entry.comparingByValue()))
            .collect(Collectors.toMap(Map.Entry::getValue, Map.Entry::getKey, (e1, e2) -> e2, LinkedHashMap::new));
    //get character with nth maximum frequency
    Optional<Entry<Integer, Character>> entry = sorted.entrySet().stream().skip(K - 1).findFirst();
    if (entry.isPresent()) {
        result = ""+entry.get().getValue();
    }
    return result;
}

Note:

If there are multiple characters with same nth frequency then the last character will be returned.

Rayon
  • 661
  • 8
  • 9
0

Maintain two maps.

  • First, mapping char to int (i.e. char to frequency) in a regular HashMap.
  • Second, mapping int to List (i.e. frequency to chars having that frequency) in a TreeMap.

Read all the input chars one by one and update both maps. When done with your input, go through the TreeMap to get the Lists.

Starting from max frequency,

  • If that list has less than k elements (say, equal to m) then go to next ones and look for (k - m)th element.
  • On the other hand, if the list has more than k elements, then return the first element in the list as there are multiple candidates for the kth highest occurring element.
displayName
  • 13,888
  • 8
  • 60
  • 75
0

Solution in o(n) time and constant space.

Assumption it contains only  charcters.

Pseudo code (writing from mobile , dont have IDE)

1) create a data structure of
Frequency
 Char ch,
 int v
2) create an array (frqy) of 52 length and type Frequency
2.1 initialize each of the 52 object with character a-zA-Z

3) itterate over each character (ch) in the string
3.1) check if (int)ch <=122 and >= 96
  Then  increment the frequency of  the frqy array at index (int)ch - 96

3.3) check if (int)ch <=90 and >= 65
  Then  increment the frequency of  the frqy array at index (int)ch - 65+26

4 ) sort the array based on frequency  and now you have the answer
Geek
  • 627
  • 1
  • 8
  • 18
  • The assumption of only upper- and lower-case US English letters is naive, and not something that was included in the original question. This solution is not at all reasonable given the prevalence of Unicode in today's computing world, where there are on the order of 1.3 million possible characters. – Jim Mischel Sep 20 '17 at 20:39
  • Thanks for the informative comment without it I wouldn't know the number of charcters in todays conputational world. – Geek Sep 20 '17 at 21:07
0
  1. We will use a map to count the frequency and then store the count and the character in a list of pairs.
  2. sort the list in a custom manner(based on the character count) (using comparator or lambda)
  3. Retrieve the Kth Value from the List.

    import java.util.*;
    import java.lang.*;
    import java.io.*;
    class Main{
        static class Pair{
        char first;
        int second;
        Pair(char first,int second){
            this.first=first;
            this.second=second;
        }
    }
    public static void main(String[] args) {
        TreeMap<Character,Integer> m = new TreeMap<>();    
        String s;
        int k;
        for(int i=0;i<s.length();i++){
            char ch = s.charAt(i);
            int x=0;
            if(m.containsKey(ch))
                x=m.get(ch);
            m.put(ch,x+1);
        }
        ArrayList<Pair> list = new ArrayList<Pair>();
        for(Map.Entry<Character,Integer> i : m.entrySet()){
            w.println(i.getKey() +" : "+i.getValue());
            list.add(new Pair(i.getKey(),i.getValue()));
        }
        Collections.sort(list, (a,b)->{if(a.second>=b.second) return -1; else return 1;});
        System.out.println(list.get(k-1).first);
    }
    
    }
    
  4. To ensure you have a kth Frequent element , you can use hashSet to count the number of unique frequencies and if it is possible to have the kth frequency.

live running code example : https://ideone.com/p34SDC

moovon
  • 2,219
  • 1
  • 17
  • 15
0

Below is the simple solution without using java 8

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
import java.util.TreeMap;
class TestClass {
public static void main(String args[]) throws Exception {
    // ma("aaabbacccd",3);
    // ma("aaabbacccbbbd",4);
    ma("aabbcd", 1);
}

private static void ma(String s, int k) {
    TreeMap<Character, Integer> map = new TreeMap<>();
    for (int i = 0; i < s.length(); i++) {
        if (map.containsKey(s.charAt(i))) {
            map.put(s.charAt(i), map.get(s.charAt(i)) + 1);
        } else {
            map.put(s.charAt(i), 1);
        }
    }
    System.out.println(map.toString());
    Set<Entry<Character, Integer>> set = map.entrySet();
    List<Entry<Character, Integer>> list = new ArrayList<>(set);
    Collections.sort(list, new Comparator<Map.Entry<Character, Integer>>() {
        public int compare(Map.Entry<Character, Integer> o1, Map.Entry<Character, Integer> o2) {
            return o2.getValue().compareTo(o1.getValue());
        }
    });
    Set<Integer> ls = new LinkedHashSet<>();
    System.out.println("after sorting by value-----");
    for (Entry<Character, Integer> e : list) {
        System.out.println("key: " + e.getKey() + " value: " + e.getValue());
        ls.add(e.getValue());
    }
    System.out.println("removed duplicate from value array --------" + ls.toString());
    System.out.println(new ArrayList<>(ls).get(k - 1));
    for (Entry<Character, Integer> e1 : list) {
        if (e1.getValue().equals(new ArrayList<>(ls).get(k - 1))) {
            System.out.println(e1.getKey());
            break;
        }
    }
}
Jérémie Bertrand
  • 3,025
  • 3
  • 44
  • 53
user3817378
  • 119
  • 10
0
String str = scanner.next();
int k = scanner.nextInt();
HashMap <Character, Integer> hash = new HashMap();
int len = str.length();
Set <Integer> set = new TreeSet < > ();
for (int i = 0; i < len; i++) {
  Integer count = hash.get(str.charAt(i));
  if (count == null) {
    count = 0;
  }
  hash.put(str.charAt(i), count + 1);
} //iterate the entire map
for (Map.Entry < Character, Integer > h: hash.entrySet()) {
  set.add(h.getValue());
}
int length = set.size();
if (k > length) {
  System.out.println("-1");
} else {
  System.out.println(set);
  Iterator < Integer > it = set.iterator();
  int j = 0;
  Integer integer = null;
  while (it.hasNext() && j <= length - k) {
    integer = it.next();
    j++;
  }
  int x = integer;
  for (Map.Entry < Character, Integer > h: hash.entrySet()) {
    if (h.getValue() == integer) {
      System.out.println(h.getKey());
      break;
    }
  }
}
FZs
  • 16,581
  • 13
  • 41
  • 50
  • 2
    Please do not post only code as an answer. Include an explanation what your code does and how it solves the problem of the question. – Mark Rotteveel Aug 03 '19 at 08:42
0

Solved Using HashMap and Priority Queue using custom Class.

import java.util.Comparator;
import java.util.HashMap;
import java.util.PriorityQueue;

public class kth_largest_in_a_string {

    static class myclass {
        int count;
        char ch;

        myclass(int count, char ch) {
            this.count = count;
            this.ch = ch;
        }
    }

    public static Character helper(String input, int k) {
        HashMap<Character, Integer> map = new HashMap<>();
        for (int i = 0; i < input.length(); i++) {
            if (!map.containsKey(input.charAt(i))) {
                map.put(input.charAt(i), 1);
            } else {
                map.put(input.charAt(i), map.get(input.charAt(i)) + 1);
            }
        }
        PriorityQueue<myclass> pq = new PriorityQueue<>(input.length(), new Comparator<myclass>() {
            public int compare(myclass a, myclass b) {
                return b.count - a.count;
            }
        });
        for (HashMap.Entry<Character, Integer> entry : map.entrySet()) {
            pq.add(new myclass(entry.getValue(), entry.getKey()));
        }
        for (int i = 0; i < k - 1; i++)
            pq.remove();
        myclass ans = pq.remove();
        return ans.ch;
    }

    public static void main(String[] args) {
        System.out.println(helper("aaabbacccd", 3));
    }
}
Kundan
  • 1
0
public class MyStringDemo {

    public static void main(String[] args) {
        try {
            Scanner scanner = new Scanner(System.in);
            System.out.print("Enter Test Case Numbers: ");
            int T = Integer.parseInt(scanner.nextLine());
            int testCases[] = new int[T];
            
            for (int i=0 ; i < testCases.length; i++) {
                System.out.print("Enter String: ");
                String str = scanner.nextLine();
                System.out.print("Enter Kth String Number: ");
                int K = Integer.parseInt(scanner.nextLine());
                String result = getKthFrequency(str, K);
                System.out.println(result);
            }
        }catch (Exception e) {
            e.printStackTrace();
        }
    }

    static String getKthFrequency(String str, int K) {
        HashMap<Character, Integer> hp = new HashMap<Character, Integer>();
        char chs[] = str.toCharArray();
        
        for (char c : chs) {
            if (hp.containsKey(c)) {
                hp.put(c, hp.get(c) + 1);
            } else {
                hp.put(c, 1);
            }
        }
        char ch = 'z';
        boolean flag = false;
        for (Map.Entry<Character, Integer> val : hp.entrySet()) {
            if (K == val.getValue()) {
                if (ch > val.getKey()) {
                    ch = val.getKey();
                    flag = true;
                }
                return ch + "";
            }
            flag = false;
        }
        return flag == true ? ch + "" : "-1";
    }
}
Brian Tompsett - 汤莱恩
  • 5,753
  • 72
  • 57
  • 129
  • Welcome to StackOverflow! While this answer may solve the OP's issues, it's recommended to provide an explanation about the written code to make your answer more understandable for the community and also to improve its quality (fix your code bounds, too) – xKobalt Aug 25 '20 at 08:03