The last for loop in this class is the culprit: the one where I write the mode words to the newly created array. The loop does not iterate a final time, even though the Eclipse debugger shows the value of i as being less than tokens.length-2. Perhaps it is a fencepost problem, but I have tried a do-while loop and several other things without success. I have also posted the client code and the text file I am using.

// This class creates an object wherein a text file is segmented and stored
// word for word in an array, facilitating a word count, the ability to check
// for the occurrence of a word and also the functionality of returning the 
// most frequently occurring words.

import java.io.*;
import java.util.Arrays;
import java.util.Scanner;

public class TextAnalysis14 {
    private String[] tokens;
    int maxNoOfWords;

    // Constructor that loads a file and assigns each word to an array index
    public TextAnalysis14 (String sourceFileName, int maxNoOfWords) throws FileNotFoundException{
        this.maxNoOfWords = maxNoOfWords;
        Scanner in = new Scanner(new FileReader(sourceFileName));
        String file = in.useDelimiter("\\Z").next();
        this.tokens = file.split ("[^a-zA-Z]+");
        in.close();
    }
    // Returns the number of words in the file.
    public int wordCount(){

        return tokens.length;

    }
    // Checks whether " word " is a word in the text.
    public boolean contains(String word){
        for(int i=0; i<tokens.length; i++){
            if(tokens[i].equalsIgnoreCase(word)){return true;}
        }
        return false;
    }
    // Returns the most frequent word(s) in lexicographical order.
    public String [] mostFrequentWords(){
        Arrays.sort(tokens);
        //Finds the mode word occurrence
        int wordValue=1;
        int maxValue=1;
        for(int i=0; i<tokens.length-2; i++){
            while(tokens[i].equalsIgnoreCase(tokens[i+1])){
                wordValue++;
                i++;
            }
            if(wordValue>maxValue){
                maxValue=wordValue;
            }
            wordValue=1;
        }
        //Determines length of return array 
        int numberOfModes=1;
        for(int i=0; i<tokens.length-2; i++){
            while(tokens[i].equalsIgnoreCase(tokens[i+1])){
                wordValue++;
                i++;
            }
            if(wordValue==maxValue){
                numberOfModes++;
            }
            wordValue=1;
        }
        //writes mode words to array
        int cursor =0;
        String[] modeWords = new String[numberOfModes];
        for(int i=0; i<tokens.length-2; i++){
            while(tokens[i].equalsIgnoreCase(tokens[i+1])){
                wordValue++;
                i++;
            }
            if(wordValue==maxValue){
                modeWords[cursor]=tokens[i];
                cursor++;
            }
            wordValue=1;
        }
        return modeWords;
    }

}

The following is my client code:

import java.io.FileNotFoundException;
import java.util.Arrays;

public class TextAnalysis_test01 {

    public static void main(String[] args) throws FileNotFoundException {

        TextAnalysis14 ta14 = new TextAnalysis14("testtext01.txt", 100);
        System.out.println(ta14.wordCount());
        System.out.println(ta14.contains("Bla"));
        System.out.println(ta14.contains("hello"));
        System.out.println(Arrays.toString(ta14.mostFrequentWords()));


    }

}

The following is the content of my txt file:

bla bla
dim dim 
dum dum

And i get the output:

6
true
false
[bla, dim, null]

As is evident, nothing is written to the last index of the returned String array. As far as I can tell, this is because the final for loop does not iterate a last time. That is the part of my class commented with: //writes mode words to array.

Any help or advice would be a godsend. Cheers!

Mr. A

1 Answer

I would really advise you to refactor your code, but if you are only after a fix, the following change to the final loop should give the expected output:

    int cursor =0;
    String[] modeWords = new String[numberOfModes];
    for(int i=0; i<=tokens.length-2; i++){
        while(i+1 <= tokens.length-1 && tokens[i].equalsIgnoreCase(tokens[i+1])){
            wordValue++;
            i++;
        }
        if(wordValue==maxValue){
            modeWords[cursor]=tokens[i];
            cursor++;
        }
        wordValue=1;
    }
    return modeWords;
}
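
To make the effect of the changed bound easier to see, here is a minimal sketch that isolates the run-walking loop and records which runs the outer loop actually reaches. The class name FencepostDemo, the method visitedRuns, and the inclusive flag are hypothetical names introduced only for this illustration; they are not part of the original class.

```java
import java.util.ArrayList;
import java.util.List;

public class FencepostDemo {
    // Returns the last word of every run of equal words the outer loop visits.
    // inclusive == false uses the original bound (i < length-2),
    // inclusive == true uses the bound from the answer (i <= length-2).
    static List<String> visitedRuns(String[] sorted, boolean inclusive) {
        List<String> visited = new ArrayList<>();
        int limit = sorted.length - 2;
        for (int i = 0; inclusive ? i <= limit : i < limit; i++) {
            // The bounds check comes first, so && short-circuits
            // before sorted[i + 1] can be read past the end of the array.
            while (i + 1 < sorted.length && sorted[i].equalsIgnoreCase(sorted[i + 1])) {
                i++;
            }
            visited.add(sorted[i]);
        }
        return visited;
    }

    public static void main(String[] args) {
        String[] tokens = {"bla", "bla", "dim", "dim", "dum", "dum"};
        System.out.println(visitedRuns(tokens, false)); // [bla, dim]
        System.out.println(visitedRuns(tokens, true));  // [bla, dim, dum]
    }
}
```

With the original bound, the outer loop stops at i == 4 and never enters the "dum" run. Note that even the inclusive bound stops short of the last index, so a word that occurs once at the very end of the sorted array (for example the input a, b) is still skipped.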
Iootu

  • Would you explain how that corrects the behavior? I'm not saying it's wrong, I haven't tried, but the code isn't self-explanatory, so I don't think the code itself is an answer. – Silly Freak Oct 26 '14 at 22:03
  • As I stated in the answer, the whole code is a bit of a mess, which is why I stressed the urgent need for a refactor. There are certainly better, more readable ways to achieve what he wants. As for the explanation: the change allows the loop to visit the last 2 positions of the array. To do so I changed the condition of the for loop from i < tokens.length-2 to i <= tokens.length-2. This poses a new problem because of the way his code is structured (i is incremented in more than one place): if the while condition is left unchanged, you will end up with an exception. – Iootu Oct 26 '14 at 22:15
  • More precisely, with an ArrayIndexOutOfBoundsException. That's why I added the "i+1 <= tokens.length-1" part of the condition: when it evaluates to false, the right side of the && is not evaluated, which prevents tokens[i+1] from throwing the exception, because at that point i==5, so i+1 would be 6, which is out of bounds. I hope you were able to follow the explanation; it wasn't easy to write, and that's due to the complexity of the code you are viewing. Whether this is an answer or not is arguable; he asked for a solution that writes to his last index, and this... – Iootu Oct 26 '14 at 22:28
  • does accomplish that. But I give absolutely no guarantee that this will work for any other input values. I only changed the section where he writes to the array, and that's what was asked for. – Iootu Oct 26 '14 at 22:33
  • I guess using the same variable in a nested loop is bad practice? – Mr. A Oct 26 '14 at 22:55
  • @Sebastian Sorry if you felt offended by the other comments. I was merely explaining why I posted the response and the mechanics behind the change I submitted. Btw, did it work? Regarding bad practices, first I advise you to shorten the method. Your comments already state each of the tasks you are performing at a given point, so why not create a separate method to handle each one? That change alone would improve your code's readability tremendously. – Iootu Oct 26 '14 at 23:08
  • None taken! I have a bunch of other test files, with which it sadly did not work. I should redesign the whole thing. Also, as you were saying, readability is pretty shabby and there seems to be a lot of redundancy. – Mr. A Oct 26 '14 at 23:21
  • Ah, I think I missed the <= in the loop's condition; that makes it a little more apparent. If this new code was working, by the way, you could have asked at Code Review for comments. Nonworking code is a no-go there, though. – Silly Freak Oct 28 '14 at 14:25
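
As one possible shape for the redesign discussed in the comments, here is a rough sketch that replaces the three run-walking loops with a word count held in a sorted map. This is a different technique from the one in the question, not the asker's or the answerer's code. The class name ModeWordsSketch and the helper countWords are hypothetical, and the sketch assumes it is acceptable to return the words in lower case.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class ModeWordsSketch {
    // Counts how often each word occurs, ignoring case.
    // A TreeMap keeps its keys in lexicographical order.
    static Map<String, Integer> countWords(String[] tokens) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : tokens) {
            if (token.isEmpty()) {
                continue; // split() can produce a leading empty string
            }
            counts.merge(token.toLowerCase(Locale.ROOT), 1, Integer::sum);
        }
        return counts;
    }

    // Returns every word that reaches the highest count, in sorted order.
    static String[] mostFrequentWords(String[] tokens) {
        Map<String, Integer> counts = countWords(tokens);
        if (counts.isEmpty()) {
            return new String[0];
        }
        int max = Collections.max(counts.values());
        List<String> modes = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() == max) {
                modes.add(entry.getKey());
            }
        }
        return modes.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] tokens = {"bla", "bla", "dim", "dim", "dum", "dum"};
        System.out.println(String.join(", ", mostFrequentWords(tokens))); // bla, dim, dum
    }
}
```

Because no loop index is advanced in two places, there is no array bound to get wrong, and the input array is left unsorted. The cost is the extra map, which should be negligible for text files of the size shown here.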