
A string looks like this: " I am seal \n\n \t where are we? ", and its printed version is:

   I am  seal 

      where are we? 

I want to turn the string into "I am seal\nwhere are we?", which prints as:

I am seal
where are we?

I am collapsing the newlines with the regex "[\r\n]+" (replacing with "\n"), but when I try to remove the whitespace, the newline gets removed as well. I have been using StringUtils from Apache Commons.

Update

Whitespace at the beginning of a line should also be removed, even when it is not consecutive.

How can I achieve this in Java?

Thank you.

seal

3 Answers

4

Update #2

This catches all leading whitespace (the caret ^ asserts that we are at the beginning of a line) and any other consecutive spaces:

^\\s+|[\\t\\f ](?=[\\t\\f ])|[\\t\\f ]$|\\s+\\z

Replace each match with the empty string (the multi-line modifier (?m) must be on):

String str = "   I am   seal \n\n  \t   where are we? ";
String result = str.replaceAll("(?m)(^\\s+|[\\t\\f ](?=[\\t\\f ])|[\\t\\f ]$|\\s+\\z)", "");
System.out.println(result);


With the help of character class intersection we can also use a shorter regex:

^\\s+|[\\s&&[^\\r\\n]](?=\\s|$)|\\s+\\z
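As a rough sketch of how the shorter pattern could be used, the snippet below compiles it once with `Pattern.MULTILINE` (equivalent to the inline `(?m)` flag) and applies it to the string from the question. The class name `WhitespaceNormalizer` and the method name `normalize` are made up for illustration only.

```java
import java.util.regex.Pattern;

final class WhitespaceNormalizer {
    // Shorter pattern from above; MULTILINE makes ^ and $ match at every line boundary.
    private static final Pattern EXTRA_WHITESPACE = Pattern.compile(
            "^\\s+|[\\s&&[^\\r\\n]](?=\\s|$)|\\s+\\z", Pattern.MULTILINE);

    // Removes leading whitespace on each line, repeated or trailing
    // horizontal whitespace, and any whitespace at the very end.
    static String normalize(String input) {
        return EXTRA_WHITESPACE.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        String str = "   I am   seal \n\n  \t   where are we? ";
        // Prints two lines: "I am seal" and "where are we?"
        System.out.println(normalize(str));
    }
}
```

Compiling the pattern once avoids recompiling it on every call, which `String.replaceAll` would do.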
revo
  • Thanks for the reply. Your regex only works with the whitespace at the beginning of the line, but it should remove all consecutive whitespace in every line. – seal Aug 23 '16 at 17:58
  • Nope. It should remove all consecutive whitespace and also newlines, whether at the beginning, middle or end. :) – seal Aug 23 '16 at 18:03
  • Removing the whitespace from the beginning of the second line is also needed, though it is not necessarily consecutive. – seal Aug 23 '16 at 18:08
  • I got it. Check update @seal – revo Aug 23 '16 at 18:22
  • It works. :) But at the end of the string there may be two newlines remaining. How do I remove them? – seal Aug 23 '16 at 18:29
  • That's working fine. That was my observation mistake. Thanks. – seal Aug 23 '16 at 18:42
  • Could you explain what this segment does: `(?=[\\t\\f ])|[\\t\\f ]`? – seal Aug 23 '16 at 20:10
  • 1
    Each pipe `|` denotes an alternation, so you actually mean `[\\t\\f ](?=[\\t\\f ])` or `[\\t\\f ]$`. The first matches a whitespace character (`\t` tab, `\f` form feed or space) that is followed by another character of the same class. The second matches any of those whitespace characters that appears at the end of a line. @seal – revo Aug 23 '16 at 20:24
  • Which part is responsible for the multiple newlines? – seal Aug 23 '16 at 21:03
  • 1
    Those are caught by both `\\s+\\z` (whitespace at the end of the string) and `^\\s+` (whitespace at the beginning of each line). In Java, `\s` matches any character of the `[ \t\n\x0B\f\r]` class. @seal – revo Aug 23 '16 at 21:09
1

Your question treats general whitespace and newlines differently.
With a single regex, you can decide what to replace each match with:

either a space or a newline.
Alas, this requires a callback function to see which alternative matched.
([^\S\r\n])+|(?:\r?\n)+
If group 1 matched, replace with a space; otherwise replace with a newline.
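In Java, one possible way to get the callback behaviour is a `Matcher.find()` loop with `appendReplacement`, choosing the replacement by checking whether group 1 took part in the match. This is only a sketch; the class name `CallbackReplace` and the method name `collapse` are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CallbackReplace {
    // Group 1 captures horizontal whitespace; the other branch matches runs of newlines.
    private static final Pattern RUNS = Pattern.compile("([^\\S\\r\\n])+|(?:\\r?\\n)+");

    static String collapse(String input) {
        Matcher m = RUNS.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // The "callback": group 1 is non-null only when the whitespace branch matched.
            m.appendReplacement(out, m.group(1) != null ? " " : "\n");
        }
        m.appendTail(out); // copy whatever follows the last match
        return out.toString();
    }

    public static void main(String[] args) {
        String result = collapse("   I am   seal \n\n  \t   where are we? ");
        // Brackets make the remaining single spaces visible.
        System.out.println("[" + result + "]");
    }
}
```

For the string from the question this yields " I am seal \n where are we? ": every run is collapsed, but single spaces at the ends and around the newline remain.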

The easier way is to do it in 2 separate steps.

Replace all [^\S\r\n]+ with a space.
Then replace all (?:\r?\n)+ with a newline.

You could use a range {2,} instead of + which might give you a marginal
performance boost.
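A minimal sketch of the two-step variant, using plain `String.replaceAll` with the `+` quantifier; the class and method names are placeholders, not part of any library.

```java
final class TwoStepReplace {
    static String collapse(String input) {
        return input
                // Step 1: runs of non-newline whitespace become a single space.
                .replaceAll("[^\\S\\r\\n]+", " ")
                // Step 2: runs of line breaks become a single newline.
                .replaceAll("(?:\\r?\\n)+", "\n");
    }

    public static void main(String[] args) {
        String result = collapse("   I am   seal \n\n  \t   where are we? ");
        // Brackets make the remaining single spaces visible.
        System.out.println("[" + result + "]");
    }
}
```

On the sample string this produces " I am seal \n where are we? ", so a single space can still remain at the ends and on either side of the newline.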

  • Thanks for the reply. After trimming the whole string, there is still a single whitespace at the start of every line except the first. It should also be removed, so that there is no whitespace after a newline. I have updated my question. – seal Aug 23 '16 at 18:18
  • 1
    Then you'd want this `[^\S\r\n]*(?:\r?\n)+[^\S\r\n]*|([^\S\r\n])+` or, in 2 steps, `[^\S\r\n]*(?:\r?\n)+[^\S\r\n]*` first, then `[^\S\r\n]+` and don't use the `{2,}` range quantifier. –  Aug 23 '16 at 19:19
0
    // Trim both ends and treat tabs as spaces.
    str = str.trim().replace("\t", " ");
    String previous;
    do {
        previous = str;
        str = str.replace("  ", " ")     // collapse double spaces
                 .replace(" \n", "\n")   // drop a space before a newline
                 .replace("\n ", "\n")   // drop a space after a newline
                 .replace("\n\n", "\n"); // collapse blank lines
    } while (!str.equals(previous));     // repeat until nothing changes
omkar sirra