-2

I want to split the sentences into one sentence per line in Java.

Input String: "Volatility returned to the municipal bond market during the first half of the funds’ fiscal year as investors weighed the potential impact of the U.S. presidential election, strengthening economic conditions and rising interest rates. The market was further pressured by a record level of municipal bond issuance in 2016. Against this backdrop, all six funds registered declines, ranging from –0.92% for American Funds Short-Term Tax-Exempt Bond Fund to –3.77% for American High-Income Municipal Bond Fund. (See pages 4 through 10 for fund specific results and information.)"

Output:

Sentence1: Volatility returned to the municipal bond market during the first half of the funds’ fiscal year as investors weighed the potential impact of the U.S. presidential election, strengthening economic conditions and rising interest rates.

Sentence2: The market was further pressured by a record level of municipal bond issuance in 2016. Against this backdrop, all six funds registered declines, ranging from –0.92% for American Funds Short-Term Tax-Exempt Bond Fund to –3.77% for American High-Income Municipal Bond Fund.

Sentence3: (See pages 4 through 10 for fund specific results and information.)

I have written Java code that splits the text wherever a full stop ('.') occurs, but it also inserts a new line after "U.S.", which is not the end of a sentence.

string = string.replace(". ", ".\n");

3 Answers

1

You could use String::split with regex to accomplish this like so:

String[] sentences = paragraph.split("(?<=[^ ]\\.) (?=[^a-z])");
int count = 0;
for(String str:sentences)
    System.out.println("Sentence " + (++count) + ":" + str);

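This uses the regex techniques called lookahead and lookbehind. The lookbehind requires a non-space character followed by a full stop before the space, and the lookahead requires that the next character is not a lowercase letter. Neither one consumes any text, so only the space is removed and each sentence keeps its full stop.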
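As a rough, self-contained sketch of the same idea, the class below wraps the split in a method and runs it on a shorter sample. The class name, method name, and sample text are made up for illustration; only the regex comes from the answer.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper class; only the regex is taken from the answer above.
class LookaroundSplit {

    // Splits on a single space that is preceded by "<non-space>." (lookbehind)
    // and followed by anything other than a lowercase ASCII letter (lookahead).
    // Both lookarounds are zero-width, so only the space itself is removed.
    static List<String> splitSentences(String paragraph) {
        return Arrays.asList(paragraph.split("(?<=[^ ]\\.) (?=[^a-z])"));
    }

    public static void main(String[] args) {
        // Assumed sample text, shortened from the question's input.
        String sample = "Rates rose in the U.S. during 2016. Funds fell. (See page 4.)";
        int count = 0;
        for (String sentence : splitSentences(sample)) {
            System.out.println("Sentence " + (++count) + ": " + sentence);
        }
    }
}
```

Note that this is a heuristic: an abbreviation followed by a capitalised word, such as "U.S. Treasury", would still be split.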

CraigR8806
  • Not working. Split when . occurs.Volatility returned to the municipal bond market during the first half of the funds’ fiscal year as investors weighed the potential impact of the U . S . presidential election, strengthening economic conditions and rising interest rates. The market was further pressured by a record level of municipal bond issuance in 2016. Against this backdrop, all six funds registered declines, ranging from –0.92% for American Funds Short-Term Tax-Exempt Bond Fund to –3.77% for American High-Income Municipal Bond Fund.... – Surjit Patra Jun 06 '17 at 10:42
  • For the test case given it works, but I will edit it and add in the test for a space before the period – CraigR8806 Jun 06 '17 at 10:44
  • @SurjitPatra give it a try now – CraigR8806 Jun 06 '17 at 10:45
0

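String#split() takes a regex. In a regex, . matches any character other than a line terminator such as \n. Escape the dot with a backslash; because the backslash must itself be escaped in a Java string literal, the argument becomes "\\."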
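A small sketch of the difference, using a made-up class name and sample string. It compares the unescaped dot, the escaped dot, and Pattern.quote, which is another standard way to treat the dot literally.

```java
import java.util.regex.Pattern;

// Hypothetical demo class showing how split() treats the dot.
class DotEscaping {

    // Unescaped: "." matches every character, so every character is a delimiter
    // and split() drops the resulting trailing empty strings.
    static String[] splitOnUnescapedDot(String text) {
        return text.split(".");
    }

    // Escaped: "\\." in Java source is the regex \. which matches a literal dot.
    static String[] splitOnEscapedDot(String text) {
        return text.split("\\.");
    }

    // Pattern.quote turns any string into a regex that matches it literally.
    static String[] splitOnQuotedDot(String text) {
        return text.split(Pattern.quote("."));
    }

    public static void main(String[] args) {
        String sample = "a.b.c"; // assumed sample input
        System.out.println(splitOnUnescapedDot(sample).length); // 0
        System.out.println(splitOnEscapedDot(sample).length);   // 3
        System.out.println(splitOnQuotedDot(sample).length);    // 3
    }
}
```

This applies to split() and replaceAll(), which take a regex. String#replace(CharSequence, CharSequence), as used in the question, treats its arguments as literal text, so the dot needs no escaping there.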

ArsenArsen
0

Try something like this inside your code:

//requires: import java.util.ArrayList; import java.util.List;
List<String> eachLine = new ArrayList<String>();
String initialString = new String("Volatility returned to the municipal bond market during the first half of the funds’ fiscal year as investors weighed the potential impact of the U.S. presidential election, strengthening economic conditions and rising interest rates. The market was further pressured by a record level of municipal bond issuance in 2016. Against this backdrop, all six funds registered declines, ranging from –0.92% for American Funds Short-Term Tax-Exempt Bond Fund to –3.77% for American High-Income Municipal Bond Fund. (See pages 4 through 10 for fund specific results and information.)");

int searchFrom = 0;
int stopIndex = initialString.indexOf(". ", searchFrom);//searching for the next occurrence of ". " (a String; '. ' is not a valid char literal)
//Note: a full stop followed by a blank space denotes either the end of a sentence or a word like U.K. or U.S.

while (stopIndex != -1 && stopIndex + 2 < initialString.length()) {
   //checking for uppercase because a sentence starts with an uppercase letter
   boolean upperCase = Character.isUpperCase(initialString.charAt(stopIndex + 2));
   if (upperCase) {
      eachLine.add(initialString.substring(0, stopIndex + 1));//add the sentence, including its full stop, to the list to be processed later
      initialString = initialString.substring(stopIndex + 2);//store the rest of the text in the same string to be processed again
      searchFrom = 0;
   } else {
      searchFrom = stopIndex + 1;//an abbreviation such as U.S., so keep searching past it
   }
   stopIndex = initialString.indexOf(". ", searchFrom);
}
eachLine.add(initialString);//whatever is left is the last sentence

You can get a general idea of how to check for uppercase from here: Java Program to test if a character is uppercase/lowercase/number/vowel

The code above is just a snippet to show how you might approach the problem.

You can also use regular expressions to find the full stop pattern, but understanding the basic approach might be more useful later.

Regular Expressions in Java: https://www.tutorialspoint.com/java/java_regular_expressions.htm
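For comparison, here is one way the same check (a full stop, a space, then an uppercase letter) might look with java.util.regex. This is only a sketch: the class name, method name, pattern, and sample text are assumptions for illustration, not part of the snippet above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class; a regex version of the "full stop + space + uppercase" check.
class RegexSentenceFinder {

    // A literal full stop, one space, then an uppercase letter.
    // The lookahead checks the letter without consuming it.
    private static final Pattern SENTENCE_END = Pattern.compile("\\. (?=\\p{Lu})");

    static List<String> findSentences(String text) {
        List<String> sentences = new ArrayList<>();
        Matcher matcher = SENTENCE_END.matcher(text);
        int start = 0;
        while (matcher.find()) {
            // matcher.start() is the index of the full stop; keep it with the sentence.
            sentences.add(text.substring(start, matcher.start() + 1));
            // matcher.end() is the first character after the space.
            start = matcher.end();
        }
        if (start < text.length()) {
            sentences.add(text.substring(start)); // remaining text is the last sentence
        }
        return sentences;
    }

    public static void main(String[] args) {
        // Assumed sample text, shortened from the question's input.
        String sample = "Rates rose in the U.S. during 2016. Funds fell. (See page 4.)";
        for (String sentence : findSentences(sample)) {
            System.out.println(sentence);
        }
    }
}
```

As with the index-based loop, text that starts with something other than an uppercase letter, such as "(See page 4.)", stays attached to the preceding sentence.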

Somdip Dey