1

By a sentence I mean a string that ends in `!`, `?` or `.`.

That excludes things like `Dr.` and `Mr.`. It is true that you cannot really detect a sentence in Java, because that depends on grammar.

But I guess what I mean is a period, exclamation mark or question mark followed by a capital letter.

How would one do this?

This is what I have, but it's not working:

      BufferedReader Compton = new BufferedReader(new FileReader(fileName));
        int sentenceCount=0;

        String violet;

        String limit="?!.";
        while(Compton.ready())
        {
            violet=Compton.readLine();

            for(int i=0; i<violet.length()-1;i++)
            {
                if(limit.indexOf(violet.charAt(i)) != -1 && i>0 && limit.indexOf(violet.charAt(i-1)) != -1)
                {
                    sentenceCount++;
                }
            }
        }
            System.out.println("the amount of sentence is " + sentenceCount);

EDIT: a new way that works better:

          String violet;
        while(Compton.ready())
        {
            violet=Compton.readLine();
            sentenceCount=violet.split("[!?.:]+").length;
            System.out.println("the number of sentences in line is " + sentenceCount);
         }
james Chol
  • It seems to me that your logic is not quite valid. Both your examples will be a period followed by a capital letter, because both `Mr.` and `Dr.` will be followed by the name of the person, which usually starts with a capital letter. – null Feb 02 '15 at 19:23
  • You are right, null. I guess maybe I should say that, for my purposes, a sentence is a string that ends with `!`, `?` or `.`. – james Chol Feb 02 '15 at 19:28
  • Surely someone can come up with some idea on how to do this, but perhaps it would take a genius. – james Chol Feb 02 '15 at 19:44
  • You might consider a natural language parsing library. Or is that overkill? [OpenNLP](https://opennlp.apache.org/), [clearnlp](https://code.google.com/p/clearnlp/), [references](http://stackoverflow.com/questions/870460/is-there-a-good-natural-language-processing-library), [more references](http://stackoverflow.com/questions/22904025/java-or-python-for-natural-language-processing). – showdev Feb 02 '15 at 23:52

3 Answers

3
BufferedReader reader = new BufferedReader(new FileReader(fileName));
int sentenceCount = 0;
String line;
String delimiters = "?!.";

while ((line = reader.readLine()) != null) { // Continue reading until end of file is reached
    for (int i = 0; i < line.length(); i++) {
        if (delimiters.indexOf(line.charAt(i)) != -1) { // If the delimiters string contains the character
            sentenceCount++;
        }
    }
}

reader.close();
System.out.println("The number of sentences is " + sentenceCount);
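
Note that the loop above counts every `?`, `!` or `.` character, so `...` counts three times and the period in `3.50` counts once. If you want something closer to the stricter rule from the question (a run of terminators followed by whitespace and a capital letter, or by the end of the text), a regex is one way to sketch it. The class and method names below are made up for illustration, and abbreviations such as `Dr. Smith` would still be counted as sentence ends.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SentenceRuleSketch {
    // One or more terminators, followed either by whitespace and an
    // uppercase letter, or by the end of the input (trailing whitespace allowed).
    private static final Pattern SENTENCE_END =
            Pattern.compile("[.?!]+(?=\\s+\\p{Lu}|\\s*$)");

    // Hypothetical helper: counts how many times the pattern matches in the text.
    static int countSentences(String text) {
        Matcher matcher = SENTENCE_END.matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // "..." is followed by a lowercase word and "3.50" by a digit,
        // so neither counts as the end of a sentence.
        System.out.println(countSentences("Wait... what? It costs 3.50 today. Fine!")); // prints 3
    }
}
```

To use this on a file you would read the whole text into one string first, since the "followed by a capital letter" check cannot see across lines when you process one line at a time.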
javac
1

One liner:

int n = new String(Files.readAllBytes(Paths.get(path))).split("[\\.\\?!]").length;

This uses Java 7 constructs to read the whole file into a byte array, creates a string from it, splits that string into an array of sentences, and takes the length of the array.
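
Two details of `String.split` affect this count. Each punctuation character is its own delimiter, so `...` produces empty pieces that are counted, and any text after the last terminator is counted as one more piece. Below is a sketch of a variant that treats a run of terminators as one delimiter and skips blank pieces. The sample string and the helper name `countBySplit` are assumptions for illustration only.

```java
import java.util.Arrays;

class SplitCountSketch {
    // Hypothetical helper: split on runs of terminators and count non-blank pieces.
    static long countBySplit(String text) {
        return Arrays.stream(text.split("[.?!]+"))
                .filter(piece -> !piece.trim().isEmpty())
                .count();
    }

    public static void main(String[] args) {
        String text = "Wait... what? Fine!";
        // Single-character delimiter: the ellipsis yields two empty pieces.
        System.out.println(text.split("[.?!]").length); // prints 5
        // Runs of terminators collapsed, blank pieces ignored.
        System.out.println(countBySplit(text));         // prints 3
    }
}
```

Text that follows the last terminator without one of its own still counts as a piece here, which may or may not be what you want.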

Lev Kuznetsov
0

A potential way to do this is to scan your file as words and then count words that are not in your exception list that end in your given punctuation.

Here's a possible implementation using Java 8 streams:

List<String> exceptions = Arrays.asList("Dr.", "Mr.");
Scanner scanner = new Scanner(new File(filename)); // may throw FileNotFoundException
Iterable<String> words = () -> scanner; // Scanner is an Iterator over whitespace-separated tokens
long sentenceCount = StreamSupport.stream(words.spliterator(), false)
    .filter(word -> word.matches(".*[\\.\\?!]"))
    .filter(word -> !exceptions.contains(word))
    .count();
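
To try the idea without a file, the same two filters can be run over a string. This is only a sketch under assumptions: the class name, the sample text and the extra `Mrs.` entry are invented for illustration, and the exception list only covers the abbreviations you put in it yourself.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ExceptionListSketch {
    // Words that end in a period but do not end a sentence (assumed list).
    private static final Set<String> EXCEPTIONS =
            new HashSet<>(Arrays.asList("Dr.", "Mr.", "Mrs."));

    // Hypothetical helper: count words ending in . ? or ! that are not exceptions.
    static long countSentences(String text) {
        return Arrays.stream(text.trim().split("\\s+"))
                .filter(word -> word.matches(".*[.?!]"))
                .filter(word -> !EXCEPTIONS.contains(word))
                .count();
    }

    public static void main(String[] args) {
        // "Dr." and "Mr." are skipped; "Jones.", "talk?" and "Yes!" are counted.
        System.out.println(countSentences("Dr. Smith met Mr. Jones. Did they talk? Yes!")); // prints 3
    }
}
```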
sprinter