-1

I'm writing a program that reads in a text file and counts the number of words in it. For this assignment, a word is defined as: 'A word is a non-empty string consisting only of letters (a, ..., z, A, ..., Z), surrounded by blanks, punctuation, hyphenation, line start, or line end.'

I'm a novice at Java programming. So far I've managed to write this instance method, which I think should work, but it doesn't.

public int wordCount() {
    int countWord = 0;
    String line = "";
    try {
        File file = new File("testtext01.txt");
        Scanner input = new Scanner(file);

        while (input.hasNext()) {
            line = line + input.next()+" ";
            input.next();
        }
        input.close();
        String[] tokens = line.split("[^a-zA-Z]+");
        for (int i=0; i<tokens.length; i++){
            countWord++;
        }
        return countWord;

    } catch (Exception ex) {
        ex.printStackTrace();
    }
    return -1;
}
techraf
  • What does the input file look like, and what is your output so far? – Ash Oct 14 '16 at 14:42
  • This code: for (int i=0; i – ControlAltDel Oct 14 '16 at 14:47
  • The input file looks like this (it's a Notepad file): This is line 1. This is another line, line2. It is raining, take an umbrella. It is raining. Take an umbrella! Really. Line five is short. He said: "Take an umbrella!" Another line, line... Did he take one? NO. Why not? The output is currently: java.util.NoSuchElementException at java.util.Scanner.throwFor(Unknown Source) at java.util.Scanner.next(Unknown Source) at TextAnalysis16.wordCount(TextAnalysis16.java:46) at TextAnalysis16.main(TextAnalysis16.java:32) Number of words in text is: -1 (How do I add line breaks in comments?) – Jonas Christophersen Oct 14 '16 at 14:48
  • Please edit your question to add more information instead of posting it in the comments. That way you can format it and everything is in one place, making it easier to help you. – Robert Oct 14 '16 at 15:01

4 Answers

0

Quoting from "Counting words in text file?":

    int wordCount = 0;

    while (input.hasNextLine()){

       String nextLine = input.nextLine();
       Scanner word = new Scanner(nextLine);

       while(word.hasNext()){
          wordCount++;    
          word.next();
       }
       word.close();
    }
    input.close();
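
For context, the NoSuchElementException reported in the comments is consistent with the question's loop calling input.next() twice for each hasNext() check: the second call is unguarded, so it fails once the tokens run out. The following is only a minimal sketch of that behaviour; the class and method names are hypothetical, and it reads from a String instead of a file to stay self-contained.

```java
import java.util.NoSuchElementException;
import java.util.Scanner;

final class ScannerLoopDemo {
    // Mirrors the loop shape in the question: two next() calls per hasNext() check.
    static int tokensReadWithDoubleNext(String text) {
        int count = 0;
        try (Scanner input = new Scanner(text)) {
            while (input.hasNext()) {
                input.next();
                count++;
                input.next(); // not guarded by hasNext(); throws if no token is left
            }
        }
        return count;
    }

    // One next() per hasNext() check: every token is consumed exactly once.
    static int tokensRead(String text) {
        int count = 0;
        try (Scanner input = new Scanner(text)) {
            while (input.hasNext()) {
                input.next();
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(tokensRead("This is line 1."));
        try {
            tokensReadWithDoubleNext("This is line"); // odd number of tokens
        } catch (NoSuchElementException e) {
            System.out.println("second next() failed: no token left");
        }
    }
}
```

With an even number of tokens the doubled call does not throw, but it silently discards every second token, so the count would still be wrong.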
Community
0

The only usable word separators in your file are spaces and hyphens. You can use a regex with the split() method.

int num_words = line.split("[\\s\\-]+").length; // number of tokens between separators
System.out.print("Number of words in file is "+num_words);

REGEX (Regular Expression):

\\s matches whitespace (including line breaks) and \\- matches a hyphen; the trailing + treats a run of consecutive separators as a single split point, so no empty strings appear between words. Wherever there is a space, line break, or hyphen, the text is split. The pieces are returned as an array whose length is the number of words in your file.
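
To see what split() actually returns, here is a small sketch run against sample lines from the question's input file. It compares the whitespace-and-hyphen pattern (written with a + quantifier so runs of separators count once) with the question's own "[^a-zA-Z]+" pattern. The class and helper names are hypothetical and not part of any answer.

```java
import java.util.Arrays;

final class SplitDemo {
    // Splits at runs of whitespace and hyphens; punctuation stays attached to tokens.
    static String[] splitOnSpacesAndHyphens(String line) {
        return line.split("[\\s\\-]+");
    }

    // Splits at runs of anything that is not an ASCII letter.
    static String[] splitOnNonLetters(String line) {
        return line.split("[^a-zA-Z]+");
    }

    // Counts tokens while skipping the empty string that split() puts first
    // when the line starts with a separator.
    static int countNonEmpty(String[] tokens) {
        int count = 0;
        for (String token : tokens) {
            if (!token.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String plain = "This is line 1.";
        String quoted = "\"Take an umbrella!\"";
        System.out.println(Arrays.toString(splitOnSpacesAndHyphens(plain)));
        System.out.println(Arrays.toString(splitOnNonLetters(plain)));
        System.out.println(Arrays.toString(splitOnNonLetters(quoted)));
        System.out.println(countNonEmpty(splitOnNonLetters(quoted)));
    }
}
```

Two things stand out: splitting only on whitespace and hyphens keeps a token such as "1." that contains no letters, and splitting on non-letters yields a leading empty string whenever a line begins with punctuation, so using the raw array length can overcount by one.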

progyammer
0
You can use a Java regular expression.
To learn about groups, read http://docs.oracle.com/javase/tutorial/essential/regex/groups.html



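    // Requires java.io.File, java.util.Scanner, java.util.regex.Matcher and java.util.regex.Pattern.
    public int wordCount() {
        // A word is a maximal run of ASCII letters. The upper bound must be 'Z':
        // the range A-z would also match the characters [ \ ] ^ _ and `.
        String patternToMatch = "([a-zA-Z]+)";
        int countWord = 0;
        try {
            Pattern pattern = Pattern.compile(patternToMatch);
            File file = new File("abc.txt");
            Scanner sc = new Scanner(file);
            while (sc.hasNextLine()) {
                Matcher matcher = pattern.matcher(sc.nextLine());
                // Each successful find() is one word.
                while (matcher.find()) {
                    countWord++;
                }
            }
            sc.close();
        } catch (Exception e) {
            e.printStackTrace();
            return -1; // signal failure, as the question's code does
        }
        return countWord; // 0 for a file that contains no words
    }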
Noman Khan
0
void run(String path)
throws Exception
{
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream(path), "UTF-8")))
    {
        int result = 0;

        while (true)
        {
            String line = reader.readLine();

            if (line == null)
            {
                break;
            }

            result += countWords(line);
        }

        System.out.println("Words in text: " + result);
    }
}

final Pattern pattern = Pattern.compile("[A-Za-z]+");

int countWords(String text)
{
    Matcher matcher = pattern.matcher(text);

    int result = 0;

    while (matcher.find())
    {
        ++result;

        System.out.println("Matcher found [" + matcher.group() + "]");
    }

    System.out.println("Words in line: " + result);

    return result;
}
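
As a side note, matching "[A-Za-z]+" counts the "line" inside "line2" as a word. Whether that agrees with the assignment's definition is a matter of interpretation. The sketch below compares it with a stricter pattern, under the assumption (mine, not stated in the question) that a letter run directly adjacent to a digit is not a word. The class, constant, and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WordCountSketch {
    // Any maximal run of ASCII letters, as in the code above.
    static final Pattern LETTER_RUN = Pattern.compile("[A-Za-z]+");

    // Assumption: a letter run touching a digit (e.g. "line2") is not a word.
    // The lookarounds reject matches with a letter or digit on either side.
    static final Pattern STRICT_WORD =
            Pattern.compile("(?<![A-Za-z0-9])[A-Za-z]+(?![A-Za-z0-9])");

    // Counts the non-overlapping matches of the pattern in the text.
    static int count(Pattern pattern, String text) {
        Matcher matcher = pattern.matcher(text);
        int result = 0;
        while (matcher.find()) {
            result++;
        }
        return result;
    }

    public static void main(String[] args) {
        String line = "This is another line, line2.";
        System.out.println("letter runs:  " + count(LETTER_RUN, line));
        System.out.println("strict words: " + count(STRICT_WORD, line));
    }
}
```

For lines without digits, such as He said: "Take an umbrella!", both patterns give the same count, so the choice only matters for mixed tokens.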
yeoman