I'm just beginning an assignment on Huffman encoding. The first step is to implement some form of file handling that will read in the file to be processed and then perform frequency counting of the characters.
I have several text files to test this against; between them they contain a mix of uppercase and lowercase letters, numbers, and symbols.
Here is what I have so far:
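import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

public class LetterFrequency {
    public static void main(String[] args) throws IOException {
        File txtfile = new File("10000random.txt");
        int[] count = new int[26]; // one counter per lowercase letter a-z

        // try-with-resources closes the reader even if read() throws
        try (BufferedReader in = new BufferedReader(new FileReader(txtfile))) {
            int nextChar; // read() returns an int so that -1 can signal end of file
            while ((nextChar = in.read()) != -1) {
                char ch = (char) nextChar;
                if (ch >= 'a' && ch <= 'z') {
                    count[ch - 'a']++;
                }
            }
        }

        System.out.println("Letter Frequency:");
        for (int i = 0; i < 26; i++) {
            // label each row with the lowercase letter that was counted; %n ends the line
            System.out.printf("%c %d%n", 'a' + i, count[i]);
        }
    }
}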
This is obviously a basic version that only handles a-z. How would I change it to include uppercase letters, numbers, symbols, etc.? It doesn't seem right to have to guess the size of the array.
Apologies if this is an obvious question, I'm still learning! Thank you
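One possible direction, as a minimal sketch rather than a definitive answer: count into a map keyed by character instead of a fixed-size array, so nothing has to be sized up front. The class name `CharFrequency`, the method `count`, and the fallback sample string are hypothetical names introduced for illustration; `TreeMap` is used only so the output comes out sorted by character.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;

public class CharFrequency {

    // Counts every character the reader produces: letters, digits, symbols, whitespace.
    public static Map<Character, Integer> count(Reader in) throws IOException {
        Map<Character, Integer> counts = new TreeMap<>(); // keys kept in sorted order
        int next; // read() returns -1 at end of input
        while ((next = in.read()) != -1) {
            // insert 1 for a new character, otherwise add 1 to the existing count
            counts.merge((char) next, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a file path is passed as the first argument;
        // without one, a short sample string is counted instead.
        try (Reader in = args.length > 0
                ? new BufferedReader(new FileReader(args[0]))
                : new StringReader("Hello, World! 123")) {
            for (Map.Entry<Character, Integer> e : count(in).entrySet()) {
                char c = e.getKey();
                // print the numeric code as well, since spaces and newlines are hard to see
                System.out.printf("'%c' (%d): %d%n", c, (int) c, e.getValue());
            }
        }
    }
}
```

If the input is known to stay within a fixed range, an array indexed directly by the character value also avoids guessing: a Java `char` is always between 0 and 65535, so `new int[65536]` covers every possible value, and `new int[128]` is enough for plain ASCII. The map version only stores the characters that actually occur, which is convenient when building the Huffman tree from the entries afterwards.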