For example, if I have the following lines of text in a file:

this is an example. this is an example.

this is an example. this is an example. this is an example

this is an example this is an example this is an example this is an example this is an example this is an example this is an example this is an example this is an example this is an example.

I want to be able to count these lines as 3 paragraphs. Right now my code counts them as 4 paragraphs, because it does not know where a paragraph begins and ends.

    Scanner file = new Scanner(new FileInputStream("../.../output.txt"));
    int count = 0;
    while (file.hasNextLine()) { //whilst scanner has more lines
        Scanner s = new Scanner(file.nextLine());
        if(!file.hasNext()){
            break;
        }
        else{
            file.nextLine();
            count++;
        }
        s.close();
    }
    System.out.println("Number of paragraphs: "+ count);
    file.close();

This is what I have so far. It reads lines of text, and treats each line as a single paragraph.

I want it to treat lines of text that have no empty line between them as 1 paragraph, and to count all the paragraphs in the file.

femtoRgon
JD14
  • Is it a statically formatted file? Could you just check for a tab ('\t') or a blank line? – Patrick J Abare II Mar 03 '14 at 19:46
  • What have you tried so far? Show some work and/or research towards finding a solution yourself, then ask for help on the *specific* problems you can't solve by yourself. – Mar Mar 03 '14 at 19:48
  • Not clear on what you want. Are you trying to count the number of words in a paragraph? If so, how do you define the beginning and end of a paragraph? That will give you a clue on how to go about it. Hint: if it is counting words you want, then Scanner.next() will pick up the next word for you. – TA Nguyen Mar 03 '14 at 20:44
  • Yes, I basically want to count the number of words in a paragraph that contains a specific word, but before that I want to define the beginning and end of a paragraph, i.e. if there is an empty space after a line or lines of text, it is the end of that paragraph. My code reads every line as a paragraph. I've struggled with this for a while. – JD14 Mar 03 '14 at 21:29

2 Answers

Scanner probably isn't the best choice if you only want to count lines. BufferedReader is probably better.

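    import java.io.BufferedReader;
    import java.io.FileReader;

    BufferedReader in = new BufferedReader(new FileReader("output.txt"));
    int count = 0;
    StringBuilder paragraph = new StringBuilder();
    while (true) {
        String line = in.readLine();
        // A blank line or the end of the file closes the current paragraph.
        if (line == null || line.trim().isEmpty()) {
            // Only count if some text was collected, so repeated or trailing
            // blank lines do not produce empty paragraphs.
            if (paragraph.length() > 0) {
                count++;
                System.out.println("paragraph " + count + ": " + paragraph);
                paragraph.setLength(0);
            }
            if (line == null) {
                break;
            }
        } else {
            // Join the lines of one paragraph with single spaces.
            if (paragraph.length() > 0) {
                paragraph.append(" ");
            }
            paragraph.append(line);
        }
    }
    in.close();
    System.out.println("Number of paragraphs: " + count);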
Ted Bigham
  • But it counts the lines fine without a problem. I just want the system to know what paragraphs are, i.e. if there is space after a line or lines of text, it is the end of that paragraph. At the moment it treats every line of text as a paragraph. – JD14 Mar 03 '14 at 21:32
  • Are you saying you want read each paragraph as a single string? If you don't care about count, then you should remove that from your example (and the title). – Ted Bigham Mar 03 '14 at 21:43
  • If you want both (count and paragraph), then just use my example plus a StringBuilder to append each line until you see a blank line. – Ted Bigham Mar 03 '14 at 21:48
  • I'm not sure how to use a StringBuilder; I'm a beginner in Java. I want the system to know when a paragraph starts and ends, then count each paragraph in the file. So basically it should read lines of text that are together, without an empty line between them, as one paragraph. Hope this is clear. – JD14 Mar 03 '14 at 22:09
  • The code doesn't work :(. This is the result: Number of lines: 0 Number of lines: 0 Number of lines: 0 Number of lines: 0 Number of lines: 0 ...... – JD14 Mar 03 '14 at 22:31
  • After editing your code, it now just prints the first line in the file repeatedly. – JD14 Mar 03 '14 at 22:35
  • 1
    Updated the answer (again). I needed to also read the next line inside the loop. Not all answers are tested on stack overflow, so you should expect to have to tweak the answer a little bit. Use it more of a guide, not a drop in replacement. – Ted Bigham Mar 03 '14 at 23:06
  • Your updated code currently just prints the first line of text repeatedly. – JD14 Mar 03 '14 at 23:38
  • I just tried it and it worked for me with a few small changes. All I had to do was handle the case for the last paragraph when the file doesn't end with a blank line. I've updated the answer. That code is tested, but the *idea* has not changed. – Ted Bigham Mar 04 '14 at 02:41
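
The follow-up goal mentioned in the comments (counting the words in paragraphs that contain a specific word) could be sketched roughly as below. This is only an illustration under a few assumptions: the class name `ParagraphWords` and the helpers `paragraphs` and `wordCount` are made up for this example, a paragraph is taken to be a run of non-blank lines, words are whatever is separated by whitespace, and the search word is matched as a plain substring.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ParagraphWords {
    // Hypothetical helper: groups consecutive non-blank lines into paragraphs.
    public static List<String> paragraphs(List<String> lines) {
        List<String> result = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                // A blank line closes the paragraph, if one is open.
                if (current.length() > 0) {
                    result.add(current.toString());
                    current.setLength(0);
                }
            } else {
                if (current.length() > 0) {
                    current.append(' ');
                }
                current.append(line.trim());
            }
        }
        // The last paragraph may not be followed by a blank line.
        if (current.length() > 0) {
            result.add(current.toString());
        }
        return result;
    }

    // Hypothetical helper: counts whitespace-separated words.
    public static int wordCount(String paragraph) {
        String trimmed = paragraph.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    public static void main(String[] args) {
        // In-memory sample; for a real file, Files.readAllLines(Paths.get("output.txt"))
        // could supply the lines instead.
        List<String> lines = Arrays.asList(
                "this is an example.",
                "this is an example.",
                "",
                "another paragraph",
                "without the search word",
                "",
                "one more example");
        List<String> paras = paragraphs(lines);
        System.out.println("Number of paragraphs: " + paras.size());
        for (String p : paras) {
            if (p.contains("example")) { // assumed search word, substring match
                System.out.println(wordCount(p) + " words: " + p);
            }
        }
    }
}
```

Keeping the splitting step separate from the counting step means the paragraph count and the per-paragraph word count both come from the same list.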

Scanner's nextLine() method strips the line terminator, so the strings it returns never contain the \n characters.

To detect the newline characters themselves, you need a class and methods that read the bytes of the file.

Try using the read() method of FileInputStream.
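
A minimal sketch of that byte-level idea follows. It is an illustration, not a tested drop-in: the class name `ByteParagraphs` and the method `countParagraphs` are hypothetical, lines are assumed to end in \n (or \r\n), and a line containing only whitespace is treated as blank. The method takes any `InputStream`, so a `FileInputStream` (ideally wrapped in a `BufferedInputStream`) can be passed in for a real file.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ByteParagraphs {
    // Hypothetical helper: counts paragraphs by scanning the raw bytes.
    public static int countParagraphs(InputStream in) throws IOException {
        int count = 0;
        boolean inParagraph = false; // true while inside a paragraph
        boolean lineHasText = false; // true once the current line has a non-whitespace byte
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') {
                // A line with no text ends the current paragraph.
                if (!lineHasText) {
                    inParagraph = false;
                }
                lineHasText = false;
            } else if (!Character.isWhitespace(b)) {
                lineHasText = true;
                // The first text after a blank line starts a new paragraph.
                if (!inParagraph) {
                    inParagraph = true;
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // In-memory sample; for a file, pass
        // new BufferedInputStream(new FileInputStream(path)) instead.
        String text = "first line\nsecond line\n\nanother paragraph\n\n\nlast one";
        InputStream in = new ByteArrayInputStream(text.getBytes());
        System.out.println("Number of paragraphs: " + countParagraphs(in));
    }
}
```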

Arjit