
I have the following code which counts and displays the number of times each word occurs in the whole text document.

try {
    List<String> list = new ArrayList<String>();
    int totalWords = 0;
    int uniqueWords = 0;
    File fr = new File("filename.txt");
    Scanner sc = new Scanner(fr);
    while (sc.hasNext()) {
        String words = sc.next();
        String[] space = words.split(" ");
        for (int i = 0; i < space.length; i++) {
            list.add(space[i]);
        }
        totalWords++;
    }
    System.out.println("Words with their frequency..");
    Set<String> uniqueSet = new HashSet<String>(list);
    for (String word : uniqueSet) {
        System.out.println(word + ": " + Collections.frequency(list,word));
    }
} catch (Exception e) {

    System.out.println("File not found");

}

Is it possible to modify this code so that it counts each word at most once per line, rather than counting every occurrence in the entire document?

2 Answers


Read the file one line at a time, then apply the counting logic to each line separately:

   // Needs java.io.File, java.io.FileReader and java.io.BufferedReader
   File fr = new File("filename.txt");
   FileReader fileReader = new FileReader(fr);
   BufferedReader br = new BufferedReader(fileReader);

   // Read the file one line at a time; readLine() returns null at end of file
   String line = null;
   while ((line = br.readLine()) != null) {
       // Code to count the occurrences of the words in this line

   }
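
To fill in the loop body above, here is a minimal sketch. It assumes words are separated by whitespace and that case and punctuation matter (so "The" and "the." count as different words). The class name `PerLineWordCount` and the helper `countLine` are made up for illustration; the file name is the one from the question.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class PerLineWordCount {

    // Adds one line's words to the running counts.
    // A word repeated within the line is counted only once for that line.
    static void countLine(String line, Map<String, Integer> counts) {
        Set<String> seenOnThisLine = new HashSet<>();
        for (String word : line.trim().split("\\s+")) {
            // Set.add returns false when the word was already seen on this line
            if (!word.isEmpty() && seenOnThisLine.add(word)) {
                counts.merge(word, 1, Integer::sum);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new TreeMap<>(); // sorted by word
        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader("filename.txt"))) {
            String line;
            while ((line = br.readLine()) != null) {
                countLine(line, counts);
            }
        } catch (IOException e) {
            System.out.println("Could not read file: " + e.getMessage());
            return;
        }
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }
}
```

Each value in the map is then the number of lines that contain the word. Keeping a running `Map` also avoids calling `Collections.frequency` once per unique word, which rescans the whole list every time.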
J_D

Yes. A Set is a collection much like an ArrayList, with the key difference that it holds no duplicates. So collect each line's words in a set before adding them to the list. In your while loop:

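while (sc.hasNextLine()) {
    // Read a whole line; sc.next() would return only a single word
    String line = sc.nextLine().trim();
    if (line.isEmpty()) {
        continue; // skip blank lines
    }
    String[] space = line.split("\\s+");
    //convert space arraylist -> set
    Set<String> set = new HashSet<String>(Arrays.asList(space));
    // A Set has no length or index access, so add all of its elements at once
    list.addAll(set);
    totalWords += space.length;
}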

The rest of the code should remain the same.
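
As a small illustration of the de-duplication step, the sketch below runs it on two made-up sample lines instead of a file. The class name `SetDedupDemo` and the helper `distinctWords` are hypothetical, and it uses `LinkedHashSet` rather than `HashSet` only so that the printed order is predictable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class SetDedupDemo {

    // Returns the distinct words of one line, in first-seen order.
    static Set<String> distinctWords(String line) {
        Set<String> words = new LinkedHashSet<>(Arrays.asList(line.trim().split("\\s+")));
        words.remove(""); // a blank line splits into a single empty string
        return words;
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        // Two sample lines standing in for the file contents
        for (String line : new String[] {"the cat and the hat", "the end"}) {
            list.addAll(distinctWords(line));
        }
        System.out.println(list);                                         // [the, cat, and, hat, the, end]
        System.out.println("the: " + Collections.frequency(list, "the")); // the: 2
    }
}
```

"the" appears twice on the first line but is added to the list only once for it, so `Collections.frequency` ends up reporting the number of lines containing each word.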

cptwonton
  • Sorry I'm not familiar with "Set" could you explain a little because if I use this code "set" is not defined, thanks. – CodingIsHardMan Jan 23 '18 at 16:16
  • You're already using the exact same Set I'm talking about further down your code! Replace my comment //convert space arraylist -> set with code to convert the space arraylist into a set. You can find out how to do this by googling. To be honest, I'm suspecting you did not actually write the original code. – cptwonton Jan 23 '18 at 16:22
  • The original code is from here https://stackoverflow.com/a/19927440/8831977 however this counts every instance of a word in the whole text file, but it's okay, I will look into using Set, thanks for your help – CodingIsHardMan Jan 23 '18 at 16:38