Counting Character Frequency in String(Java)

Question

So since i was assigned to do a question of finding the frequency of the character in the string Here are the example from the geeksforgeeks but i couldn't understand what it is doing? So i need help someone to explain it to me.

Input : geeksforgeeks
Output :
Number of Occurrence of g is:2
Number of Occurrence of e is:4
Number of Occurrence of k is:2
Number of Occurrence of s is:2
Number of Occurrence of f is:1
Number of Occurrence of o is:1
Number of Occurrence of r is:1

Here is the code


class NoOfOccurenceOfCharacters { 
    static final int MAX_CHAR = 256; 
  
    static void getOccuringChar(String str) 
    { 
        // Create an array of size 256 i.e. ASCII_SIZE 
        int count[] = new int[MAX_CHAR]; 
  
        int len = str.length(); 
  
        // Initialize count array index 
        for (int i = 0; i < len; i++) 
            count[str.charAt(i)]++; 
  
        // Create an array of given String size 
        char ch[] = new char[str.length()]; 
        for (int i = 0; i < len; i++) { 
            ch[i] = str.charAt(i); 
            int find = 0; 
            for (int j = 0; j <= i; j++) { 
  
                // If any matches found 
                if (str.charAt(i) == ch[j])  
                    find++;                 
            } 
  
            if (find == 1)  
                System.out.println("Number of Occurrence of " + 
                 str.charAt(i) + " is:" + count[str.charAt(i)]);             
        } 
    } 
    public static void main(String[] args) 
    { 
        Scanner sc = new Scanner(System.in); 
        String str = "geeksforgeeks"; 
        getOccuringChar(str); 
    } 
}

The output

Number of Occurrence of g is:2
Number of Occurrence of e is:4
Number of Occurrence of k is:2
Number of Occurrence of s is:2
Number of Occurrence of f is:1
Number of Occurrence of o is:1
Number of Occurrence of r is:1

What does count[str.charAt(i)]++ actually do? I am confused at this part, please anyone explain it to me?

And why is there find = 0?

Starting from initialize count array index does that mean the frequency of the character is increased by 1? — Anonymous, Dec 24 '19 at 09:32
count[str.charAt(i)]++ is incrementing the value at index represented by str.charAt(i). Basically, each character in string is referred as an index in count array. So, when 'e' is found again, the value at index of 'e' within the count array is incremented. chars can be held in ints as per rules in Java primitives. — S B, Dec 24 '19 at 09:38
I think i got it for count[str.charAt(i)]++ part, then why do i need the find = 0 and another for loop with 'j' variable? to check whether it is a duplicated character? — Anonymous, Dec 24 '19 at 09:41
Does this answer your question? [How do I count the number of occurrences of a char in a String?](https://stackoverflow.com/questions/275944/how-do-i-count-the-number-of-occurrences-of-a-char-in-a-string) — Mario Codes, Sep 03 '20 at 10:18

score 3 · Accepted Answer · answered Dec 24 '19 at 09:51

3

Well, count is an int[] with 256 slots:

int count[] = new int[MAX_CHAR]; // MAX_CHAR is 256

Your algorithm defines MAX_CHAR = 256, because it assumes the string consists only of 8-Bit ASCII characters.

[0, 0, ..., 0, 0] // 256 slots

Now you're iterating each character in string str and cast it to an integer (see type casting of primitives in Java). An A will be casted to 65 (ASCII table), a B to 66 and so on. The casted int is the slot to increment. So a string A would lead to an increment of the integer at index 65. Your question was primarily about

count[str.charAt(i)]++

That translates to this:

char c = str.charAt(i);    // c = A
int index = c;             // c = A, casted to an int = 65
count[index]++             // increments the int at position 65

Result:

[0, 0, ..., 1, ..., 0, 0]
            ^ index 65

The next A would increment the int at index 65 again:

[0, 0, ..., 2, ..., 0, 0]
            ^ index 65

answered Dec 24 '19 at 09:51

steffen

16,138
4
42
81

Thanks, this make me understand easily now. How about the find = 0? – Anonymous Dec 24 '19 at 10:16
Well the algorithm is not very clever, because it has three loops each with a complexity of `n`, and one of the loops is nested. A better one would loop only once over the length (in `O(n)`) and a second one over the 256-size histogram (which is `O(1)`). To answer your question: That `find = 0` part basically iterates the string again and prints the histogram result value, *if* the loop encountered the character for the first time (`for ...; j <= i; ...`) in order to print the value only once and in occurrence order. – steffen Dec 24 '19 at 10:24
@Anonymous If you may use other data structures, try this: `Map occurrences = new LinkedHashMap<>(); str.chars().forEach(c -> occurrences.merge((char) c, 1, Integer::sum));` This also tracks occurrence order, but it's much faster and not limited to ASCII chars. – steffen Dec 24 '19 at 10:48
That would be `O(n)` instead of `O(n²)` and the result map contains only those characters that were actually found in the string. – steffen Dec 24 '19 at 10:50

score 1 · Answer 2 · edited Sep 03 '20 at 13:25

1

Use HashMap to store characters and frequency in String loop through each character in String

For Example:

(charCount.containsKey(arr1[i])) {
            charCount.put(arr1[i], charCount.get(arr1[i]) + 1);

edited Sep 03 '20 at 13:25

Mario Codes

689
8
15

answered Sep 02 '20 at 16:37

CuriosCoder

91
1
9

score 0 · Answer 3 · answered Dec 24 '19 at 09:49

So, count[str.charAt(i)]++ includes a lot of different stages and can be simplified as follow for better understanding:

char currentChar = str.charAt(i); // getting the current char to in the string
int value = count[currentChar]; // Get the current counter value
value = value + 1; // Increasing the current appearnce of this character by one
count[currentChar] = value; // Update the counter value

Basically your code assumes that you're using ASCII string (255 different characters possible) and counting each one of them.

score 0 · Answer 4 · answered Dec 24 '19 at 10:32

It looks as if the intended goal was to print out the duplicated characters in the order of their first appearance in the given string. In order to do that, we're stepping through str from left to right with

for (int i=0; i < len; ++i) {

The array ch, the same length as str, is being used to keep track of the characters of str that we've examined so far:

    ch[i] = str.charAt(i);

Next, we count the number of times the character str.charAt(i) occurs in ch[0]..ch[i], accumulating the count in find:

   int find = 0;
     //...
   for (int j = 0; j <= i; j++) { 
        if (str.charAt(i) == ch[j])  
            find++;                 
   }

If find == 1, that means we're encountering the character str.charAt(i) for the first time, and we should print out its frequency:

    if (find == 1)
        System.out.println("Number of Occurrence of " + 
            str.charAt(i) + " is:" + count[str.charAt(i)]);  
}

Notice that at any given moment the characters in ch[0]..ch[i] are exactly the same as the characters str.charAt(0)..str.charAt(i), so the extra ch array isn't really necessary. we could have counted the occurrences of str.charAt(i) directly in the first i characters of str, like this:

for (int i = 0; i < str.length(); ++i){
    int find = 0;
    for (int j = 0; j <= i; ++j) {
        if (str.charAt(j) == str.charAt(i))
            ++find;
    }
    if (find == 1)
        System.out.println("Number of Occurrence of " + 
             str.charAt(i) + " is:" + count[str.charAt(i)]);             
}

Another approach is possible that doesn't require the characters to be recounted, provided you don't need the frequency counts anymore once they've been displayed. You can use the frequency counts to keep track of which characters' frequencies have and have not been printed by 1) zeroing out the the frequency count after printing it out and 2) only printing characters whose frequencies are non-zero:

for (int i = 0; i < str.length(); ++i) {
    if (count[str.charAt(i) > 0) {
        System.out.println("Number of Occurrence of " + 
             str.charAt(i) + " is:" + count[str.charAt(i)]);
        count[str.charAt(i)] = 0;
    }
}

Vishwajeet Barve · Answer 5 · 2020-09-03T14:44:58.023

Optimized way

import java.util.ArrayList;
import java.util.List;
public class HelloWorld{
        
    static final int MAX_CHAR = 256;
    static List<Character> sequence = new ArrayList<>();
    static void getOccuringChar(String s){
        int count[] = new int[MAX_CHAR];
        for(int i=0;i<s.length();i++){
            count[s.charAt(i)]++;
            if(!sequence.contains(s.charAt(i)))
                sequence.add(s.charAt(i));
                
        }
        
        for(int i=0;i<sequence.size();i++)
                System.out.println("Number of Occurrence of "+sequence.get(i)+" is:"+count[sequence.get(i)]);
    }
    
    public static void main(String arg[]){
        getOccuringChar("geeksforgeeks");
    }
}

Output is

Number of Occurrence of g is:2
Number of Occurrence of e is:4
Number of Occurrence of k is:2
Number of Occurrence of s is:2
Number of Occurrence of f is:1
Number of Occurrence of o is:1
Number of Occurrence of r is:1

Counting Character Frequency in String(Java)

5 Answers5