1

I just had a homework assignment that asked me to add all the Java keywords to a HashSet, then read in a .java file and count how many times any keyword appeared in it.

The route I took was: I created a String[] array containing all the keywords, created a HashSet, and used Collections.addAll to add the array to the HashSet. Then, as I iterated through the text file, I checked each word with HashSet.contains(currentWordFromFile).
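For readers following along, here is a minimal sketch of that approach. The class name KeywordSetDemo, the countKeywords helper, and the abbreviated keyword list are assumptions made for illustration; they are not taken from the pastebin code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class KeywordSetDemo {

    // Abbreviated list for illustration only; the assignment uses every Java keyword.
    static final String[] KEYWORDS = { "class", "int", "public", "return", "static", "void" };

    // Counts how many of the given words appear in the keyword set.
    static int countKeywords(List<String> words) {
        Set<String> keywordSet = new HashSet<String>();
        Collections.addAll(keywordSet, KEYWORDS);

        int count = 0;
        for (String word : words) {
            if (keywordSet.contains(word)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("public", "static", "int", "add", "return", "a");
        System.out.println(countKeywords(words)); // prints 4
    }
}
```

Swapping `new HashSet<String>()` for `new TreeSet<String>()` would give the same count, since both implement Set; the TreeSet additionally keeps the keywords sorted, which this task does not need.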

Someone recommended using a Hashtable to do this. Then I saw a similar example using a TreeSet. I was just curious: what's the recommended way to do this?

(Complete code here: http://pastebin.com/GdDmCWj0 )

Hristo
snowBlind

2 Answers

2

Try a Map<String, Integer> where the String is the word and the Integer is the number of times the word has been seen.

One benefit of this is that you do not need to process the file twice.
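A rough sketch of what this could look like, assuming the map is pre-filled with the keywords at a count of 0 so that non-keywords are never stored. The class name KeywordMapDemo, the countKeywords helper, and the tiny keyword list are hypothetical and exist only for this example.

```java
import java.util.HashMap;
import java.util.Map;

class KeywordMapDemo {

    // Returns a map from each keyword to the number of times it occurs in words.
    static Map<String, Integer> countKeywords(String[] keywords, String[] words) {
        Map<String, Integer> counts = new HashMap<String, Integer>();

        // Start every keyword at 0 so only keywords are ever keys in the map.
        for (String keyword : keywords) {
            counts.put(keyword, 0);
        }

        for (String word : words) {
            Integer seen = counts.get(word);
            if (seen != null) {          // null means the word is not a keyword
                counts.put(word, seen + 1);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] keywords = { "if", "int" };
        String[] words = { "int", "x", "int", "if", "y" };
        Map<String, Integer> counts = countKeywords(keywords, words);
        System.out.println("int: " + counts.get("int")); // prints int: 2
        System.out.println("if: " + counts.get("if"));   // prints if: 1
    }
}
```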

TofuBeer
  • It seems that way would make it easier to count specific keywords individually. Do you think there is a downside to the way I did it, given I don't need to count each keyword individually? – snowBlind Apr 27 '11 at 06:10
  • Start with the Map containing just the keywords and a value of 0 for each one. Call Map.get to get the value; if it returns a non-null value, increment it and store it again. If it was null, there is nothing to do because the word wasn't a keyword. – TofuBeer Apr 27 '11 at 14:20
1

You said "had a homework assignment" so I'm assuming you're done with this.

I would do it a bit differently. Firstly, I think some of the keywords in your String array were incorrect. According to Wikipedia and Oracle, Java has 50 keywords. Anyway, I've commented my code fairly well. Here's what I came up with...

import java.io.BufferedReader;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.Map;
import java.util.HashMap;

public class CountKeywords {

    public static void main(String args[]) {

        String[] theKeywords = { "abstract", "assert", "boolean", "break", "byte", "case", "catch", "char", "class", "const", "continue", "default", "do", "double", "else", "enum", "extends", "false", "final", "finally", "float", "for", "goto", "if", "implements", "import", "instanceof", "int", "interface", "long", "native", "new", "null", "package", "private", "protected", "public", "return", "short", "static", "strictfp", "super", "switch", "synchronized", "this", "throw", "throws", "transient", "true", "try", "void", "volatile", "while" };

        // put each keyword in the map with value 0 
        Map<String, Integer> theKeywordCount = new HashMap<String, Integer>();
        for (String str : theKeywords) {
            theKeywordCount.put(str, 0);
        }

        File file = new File(args[0]);

        // open the file; try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader(file))) {

            String sLine;

            // read lines until reaching the end of the file
            while ((sLine = br.readLine()) != null) {

                // split the line on non-word characters to extract its words
                // (this simple split also sees words inside comments and string literals)
                for (String word : sLine.split("\\W+")) {

                    // only keywords are keys in the map, so anything else is skipped
                    if (theKeywordCount.containsKey(word)) {
                        theKeywordCount.put(word, theKeywordCount.get(word) + 1);
                    }
                }
            }

        } catch (FileNotFoundException exception) {
            exception.printStackTrace();
        } catch (IOException exception) {
            exception.printStackTrace();
        }

        // add up the counts of all keywords
        int occurrences = 0;
        for (Integer i : theKeywordCount.values()) {
            occurrences += i;
        }

        System.out.println("\n\nTotal occurrences in file: " + occurrences);
    }
}

Every time I encounter a word from the file, I first check if it's in the Map; if it isn't, it's not a keyword; if it is, then I update the value the keyword is associated with, i.e., I increment the associated Integer by 1 because we've seen this keyword once more.

Alternatively, you could get rid of that last for loop and just keep a running count, so you would instead have...

if (theKeywordCount.containsKey(word)) {
    occurrences++;
}

... and you print out the counter at the end.
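As a rough sketch of that running-count variant: because no per-keyword totals are kept, a plain Set of keywords is enough and the Map values are never needed. The class name RunningKeywordCount, the countKeywords helper, the abbreviated keyword list, and reading from a StringReader instead of a file are all assumptions made to keep the example self-contained.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class RunningKeywordCount {

    // Reads the source line by line and keeps one running total of keyword hits.
    static int countKeywords(Reader source, Set<String> keywords) {
        int occurrences = 0;
        try (BufferedReader br = new BufferedReader(source)) {
            String line;
            while ((line = br.readLine()) != null) {
                // Split on runs of non-word characters; empty pieces never match a keyword.
                for (String word : line.split("\\W+")) {
                    if (keywords.contains(word)) {
                        occurrences++;
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return occurrences;
    }

    public static void main(String[] args) {
        // Abbreviated keyword list, for illustration only.
        Set<String> keywords = new HashSet<>(Arrays.asList("int", "return", "static"));
        String source = "static int twice(int x) {\n    return 2 * x;\n}\n";
        System.out.println(countKeywords(new StringReader(source), keywords)); // prints 4
    }
}
```

To count a real file, a FileReader could be passed in place of the StringReader. Note that this simple splitting also counts keywords that appear inside comments and string literals.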

I don't know if this is the most efficient way to do this, but I think it's a solid start.

Let me know if you have any questions. I hope this helps.
Hristo

Hristo
  • Hristo, before I look through all of your code: yes, the homework assignment is completed. Also, as to why I have more than 50 keywords, the homework assignment specified we should also include the 3 reserved words false, null, and true, which I forgot to mention. Thanks for the post. I'm going to read through and see the way you did things now. I definitely appreciate seeing how someone with far more programming experience would approach the assignment. – snowBlind Apr 27 '11 at 06:32
  • 1. gotcha. I'll add those to my list. 2. I don't have "far more programming experience". I'm still a college student :) 3. good luck! let me know what you think and if you have questions. In the meantime, I'm going to get some sleep. – Hristo Apr 27 '11 at 06:37
  • br.close belongs in a finally block. java.util.Scanner seems far less complicated to use than br(fr(file)). The comments are noise (`FileNotFoundException exception) {// Unable to find file.` or `// Unable to read line`). Instead of `if (a == 0) {/*empty*/} else ...` just write `if (a != 0) { ... `. – user unknown Apr 27 '11 at 08:23
  • thanks for your suggestions. regarding `Scanner`, in my experience it's slower than using `FileReader` and `BufferedReader` – Hristo Apr 27 '11 at 14:11
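For comparison, here is a minimal sketch of the Scanner-based variant mentioned in the comments above. It makes no claim about speed either way. The class name ScannerKeywordCount, the countKeywords helper, the abbreviated keyword list, and the use of a StringReader instead of a file are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;

class ScannerKeywordCount {

    // Lets Scanner do the tokenizing: every run of non-word characters is a delimiter.
    static int countKeywords(Readable source, Set<String> keywords) {
        int occurrences = 0;
        try (Scanner scanner = new Scanner(source)) {
            scanner.useDelimiter("\\W+");
            while (scanner.hasNext()) {
                if (keywords.contains(scanner.next())) {
                    occurrences++;
                }
            }
        }
        return occurrences;
    }

    public static void main(String[] args) {
        // Abbreviated keyword list, for illustration only.
        Set<String> keywords = new HashSet<>(Arrays.asList("int", "return", "static"));
        String source = "static int twice(int x) {\n    return 2 * x;\n}\n";
        System.out.println(countKeywords(new StringReader(source), keywords)); // prints 4
    }
}
```

For a real file, `new Scanner(new File(path))` could be used instead, which throws FileNotFoundException if the file is missing.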