How to check whether a String contains any whitespace such as '\r', '\t', '\n'... other than spaces?

For example, given String a = "a\nb" and String b = "a b", I want to return true for a and false for b.

I know there is Character.isWhitespace(char c) and Pattern.compile("\\s").matcher(string).find(), but both take the space (' ') into account. What I want is to find any escape character that Character.isWhitespace(char c) considers whitespace, except for ' '.

I don't want to check character by character; ideally there is a suitable regex that I can use with Pattern.compile.

weston
yunjing li

4 Answers

Like this?

@Test
public void testLines() {
    // The character class matches a newline, carriage return, or tab, but not a space.
    assertTrue(Pattern.compile("[\n\r\t]").matcher("a\nb").find());
    assertFalse(Pattern.compile("[\n\r\t]").matcher("a b").find());
}
Jochen Bedersdorfer

You could use [^\S ], which matches any character that is neither \S (non-whitespace) nor ' ' (space).

Pattern pattern = Pattern.compile("[^\\S ]");

String a = "a\nb";
String b = "a b";

System.out.println(pattern.matcher(a).find()); // true
System.out.println(pattern.matcher(b).find()); // false
Bubletan

I assume that when you say "all '\r', '\t', '\n'... other than spaces", what you mean is "any whitespace character other than U+0020" (where U+0020 is a simple space). Is this correct?

If so, then the following regex (general form) should work:

(?! )\s

This will match any whitespace character that is not a simple space. This regex makes use of negative lookahead.
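
As a minimal sketch of the lookahead form in Java (the class and method names here are illustrative, not from the question), the pattern can be compiled once and reused. Keep in mind that, by default, Java's \s covers only [ \t\n\x0B\f\r], which is narrower than Character.isWhitespace.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption for illustration.
class NonSpaceWhitespaceLookahead {
    // (?! ) asserts that the next character is not U+0020;
    // \s then consumes exactly one whitespace character.
    private static final Pattern NON_SPACE_WHITESPACE = Pattern.compile("(?! )\\s");

    // Returns true if the input contains at least one whitespace
    // character (as defined by \s) other than the plain space.
    static boolean containsNonSpaceWhitespace(String input) {
        return NON_SPACE_WHITESPACE.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsNonSpaceWhitespace("a\nb"));  // true
        System.out.println(containsNonSpaceWhitespace("a b"));   // false
        System.out.println(containsNonSpaceWhitespace("a \tb")); // true
    }
}
```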


EDIT:

As @Bubletan states in their answer, the following regex will also work:

[^\S ]

These two regexes are equivalent. This is because (?! )\s ≣ "(is NOT the character U+0020) AND (is whitespace)" and [^\S ] ≣ "is NOT (non-whitespace OR the character U+0020)" have the same truth table:

Let P(x) be the predicate "x is the character U+0020"
Let Q(x) be the predicate "x is whitespace"

P | Q | (¬P)∧Q | ¬(¬Q∨P)
–– ––– –––––––– ––––––––
T   T      F       F
T   F      F       F
F   T      T       T
F   F      F       F

For the sake of efficiency, however, you are probably better off using @Bubletan's solution ([^\S ]), since lookaround is generally slower than a plain character class.

This is how you could implement it:

// Create the pattern.  (do only once)
Pattern pattern = Pattern.compile("[^\\S ]");

// Test an input string.  (do for each input)
Matcher matcher = pattern.matcher(string);
boolean result = matcher.find();

result will then indicate whether string contains any whitespace other than a simple space.
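
The claimed equivalence of (?! )\s and [^\S ] can also be checked empirically. The following sketch (class and method names are hypothetical) tries both patterns against every single-char string and counts disagreements; it also lists which char values match, assuming the default flags (no UNICODE_CHARACTER_CLASS).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical class for illustration; not part of the original answer.
class RegexEquivalenceCheck {
    static final Pattern LOOKAHEAD = Pattern.compile("(?! )\\s");
    static final Pattern NEGATED_CLASS = Pattern.compile("[^\\S ]");

    // Counts the char values on which the two patterns give different results.
    static int countDisagreements() {
        int disagreements = 0;
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            String s = String.valueOf((char) c);
            boolean viaLookahead = LOOKAHEAD.matcher(s).matches();
            boolean viaClass = NEGATED_CLASS.matcher(s).matches();
            if (viaLookahead != viaClass) {
                disagreements++;
            }
        }
        return disagreements;
    }

    // Collects the char values (as ints) matched by [^\S ].
    static List<Integer> matchedCodeUnits() {
        List<Integer> matched = new ArrayList<>();
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            if (NEGATED_CLASS.matcher(String.valueOf((char) c)).matches()) {
                matched.add(c);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        System.out.println("disagreements: " + countDisagreements());
        System.out.println("matched char values: " + matchedCodeUnits());
    }
}
```

With the default flags, the matched values should be tab, line feed, vertical tab, form feed, and carriage return, which is \s minus the plain space.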

Community
Travis

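In Java (8 and later), \h matches any horizontal whitespace character, including the plain space and the tab, so [^\\h]+ matches runs of characters that are not horizontal whitespace. Note that this also matches ordinary letters and digits, so on its own it does not isolate whitespace other than ' '. As far as I know, \h is not available in every regex flavor.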
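
To see how \h behaves, here is a small sketch assuming Java 8 or later (the class and method names are made up for illustration). It shows that \h matches both the space and the tab but not a newline, and that [^\\h]+ also matches ordinary letters.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; requires Java 8+ for \h.
class HorizontalWhitespaceDemo {
    static final Pattern HORIZONTAL = Pattern.compile("\\h");
    static final Pattern NOT_HORIZONTAL = Pattern.compile("[^\\h]+");

    // True if the whole input is a single horizontal whitespace character.
    static boolean isHorizontal(String s) {
        return HORIZONTAL.matcher(s).matches();
    }

    // True if the input contains any character that is not horizontal whitespace.
    static boolean hasNonHorizontal(String s) {
        return NOT_HORIZONTAL.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(isHorizontal(" "));       // true
        System.out.println(isHorizontal("\t"));      // true
        System.out.println(isHorizontal("\n"));      // false
        // "a b" contains letters, which are not horizontal whitespace,
        // so [^\h]+ finds a match even though there is no tab or newline.
        System.out.println(hasNonHorizontal("a b")); // true
    }
}
```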

Kang Andrew