1

I want to remove all the extra whitespace between sentences and get the text as one string for further processing.

For example:

The meaning of the phrase "ice cream" varies from one country to another. Phrases 
such as "frozen custard", "frozen yogurt", "sorbet", "gelato" and others are used 
to distinguish different varieties and styles.

In some countries, such as the United States, the phrase "ice cream" applies only
to a    specific variety, and most governments regulate the commercial use of
the   various terms according to the relative quantities of the main ingredients. 

Products that do not meet the criteria to be called ice cream are labelled
"frozen dairy dessert" instead. In other countries, such as Italy and 
Argentina, one word is used for all variants.Analogues made from dairy 
alternatives,  such as goat's or sheep's milk, or milk substitutes, are 
available for those who are lactose intolerant, allergic to dairy protein, 
or vegan.  The most popular flavours of ice cream in North America (based
 on consumer surveys) are vanilla and chocolate.

If I paste the above text into the console, it takes only the first sentence and then evaluates it. I want to get the entire text as a single string. Is that possible? I have tried a lot, but everything I tried removes only the whitespace inside a sentence, and it makes no sense to remove the spaces between words. I want to remove the spaces between sentences and paragraphs. Can anyone help me?

chopss

3 Answers

6

Use regular expression:

myText.trim().replaceAll("\\s+", " ");
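
To see what that one-liner does to multi-line input, here is a minimal sketch. The class name `WhitespaceNormalizer`, the method `normalize`, and the shortened sample text are only illustrative assumptions; the regex is the one from the line above. Because `\s` also matches line breaks, blank lines between paragraphs collapse to a single space, while single spaces between words are left alone.

```java
class WhitespaceNormalizer {

    // Collapses every run of whitespace (spaces, tabs, line breaks) into one space.
    static String normalize(String text) {
        return text.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        // Shortened, hypothetical sample modelled on the text in the question.
        String text = "to a    specific variety, and most governments\n"
                + "regulate the   various terms. \n"
                + "\n"
                + "Products that do not meet the criteria are labelled\n"
                + "\"frozen dairy dessert\" instead.  ";
        System.out.println(normalize(text));
    }
}
```

Note that this only shrinks existing whitespace. It will not insert a space where one is missing altogether, as in `variants.Analogues` in the question's text.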
Tunguska
  • I was going to suggest `[ ]{2,}` instead, but that works too ;) – MadProgrammer Apr 29 '14 at 06:52
  • @chopu - Read all the text into a String, then use `String newPara = myText.trim().replaceAll("\\s+", " ");` – TheLostMind Apr 29 '14 at 06:56
  • @chopu - consider Ankur's answer. It's correct. You might be reading only one line, or you might be replacing the string instead of appending to it. – TheLostMind Apr 29 '14 at 07:00
  • @WhoAmI the problem is that I am reading this input through the console. Is there any way to write the console input to a file? – chopss Apr 29 '14 at 07:04
  • @chopu - then use a while loop with hasNextLine() and read each line. Use a StringBuilder to build a String for the paragraph. Then split on "\\s+". – TheLostMind Apr 29 '14 at 07:08
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/51647/discussion-between-chopu-and-whoami) – chopss Apr 29 '14 at 07:10
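
The console approach suggested in the comments above could be sketched roughly as follows. This is an assumption-laden illustration, not code from the thread: the class `ConsoleParagraphReader` and the method `readAll` are made-up names, and it collapses whitespace with `replaceAll` rather than splitting on `"\\s+"`.

```java
import java.util.Scanner;

class ConsoleParagraphReader {

    // Reads every remaining line from the scanner and returns them as one
    // string with single spaces between words.
    static String readAll(Scanner scanner) {
        StringBuilder builder = new StringBuilder();
        while (scanner.hasNextLine()) {
            // nextLine() drops the line terminator, so append a space
            // to keep the last word of one line apart from the next.
            builder.append(scanner.nextLine()).append(' ');
        }
        return builder.toString().trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        // Reading stops at end of input (Ctrl+D on Unix-like terminals,
        // Ctrl+Z followed by Enter on Windows).
        try (Scanner scanner = new Scanner(System.in)) {
            System.out.println(readAll(scanner));
        }
    }
}
```

Reading until end of input, instead of stopping at the first line, is what lets the whole pasted text arrive as one string.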
1

Try something like this:

    StringBuilder builder = new StringBuilder();

    // try-with-resources closes the reader even if readLine() throws.
    try (BufferedReader reader = new BufferedReader(new FileReader("FILE-PATH"))) {
        String str;
        while ((str = reader.readLine()) != null) {
            // Strips every whitespace character, including the spaces between words.
            builder.append(str.replaceAll("\\s+", ""));
        }
    }

    // Complete paragraph without spaces.
    System.out.println(builder.toString());

Note: To remove the blank lines between paragraphs, you need to remove the newline characters ('\n') from your String.

str.replaceAll("\n+", "")
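
If the spaces between words should survive, one possible variation is to join the lines with a space before collapsing whitespace. The sketch below is hypothetical: `ParagraphJoiner` and `join` are invented names, and a `StringReader` stands in for the `FileReader` so the example runs without a file. It uses `BufferedReader.lines()`, which requires Java 8 or later.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.stream.Collectors;

class ParagraphJoiner {

    // Joins all lines with a space, then collapses whitespace runs to one space.
    static String join(BufferedReader reader) {
        return reader.lines()
                .collect(Collectors.joining(" "))
                .trim()
                .replaceAll("\\s+", " ");
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample with a blank line between two paragraphs.
        String sample = "Phrases \nsuch as \"sorbet\" are used \nto distinguish styles.\n"
                + "\n"
                + "In some   countries one word is used.";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            System.out.println(join(reader));
        }
    }
}
```

Both `readLine()` and `lines()` already drop line terminators, so the separator has to be added back explicitly; otherwise the last word of one line fuses with the first word of the next.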

Ankur Shanbhag
  • Thanks for your reply. But the problem is that I'm reading this paragraph through the console. – chopss Apr 29 '14 at 07:01
  • That is even simpler then. In that case you will receive the entire paragraph as a single string. Just use 'str.replaceAll("\\s+", "")' in that case. – Ankur Shanbhag Apr 29 '14 at 07:05
  • No, I just want to remove the spaces between paragraphs and sentences, not between words. – chopss Apr 29 '14 at 07:22
0

I hope the snippet below helps you.

public class RegexTest {

    public static void main(String[] args)
    {

        String text="this is para 1."
                + "\n\n"
                + "this is para 2."
                + "\n\n"
                + "This is para 3.";
        System.out.println("Text looks like :\n "+text);
        String text2=text.replaceAll("\\s", "");
        System.out.println("\nText2 looks like: \n"+text2);

    }
}

Output

Text looks like :
 this is para 1.

this is para 2.

This is para 3.

Text2 looks like: 
thisispara1.thisispara2.Thisispara3.
Parul S