17

What kinds of whitespace are there in Java? I need to check in my code whether the text contains any whitespace.

My code is:

if (text.contains(" ") || text.contains("\t") || text.contains("\r") 
       || text.contains("\n"))   
{  
   //code goes here
}   

I already know about \n, \t, \r and space.

Tom
mkounal
  • http://stackoverflow.com/questions/4731055/whitespace-matching-regex-java – perilbrain Aug 08 '12 at 11:32
  • 1
    I had to change one line of code to `if (Character.isWhitespace(text.charAt(i)) || Character.isSpaceChar(text.charAt(i))) {` to get the results I wanted. – ericharlow Nov 25 '13 at 23:19

8 Answers

23

For a non-regular expression approach, you can check Character.isWhitespace for each character.

boolean containsWhitespace(String s) {
    for (int i = 0; i < s.length(); ++i) {
        if (Character.isWhitespace(s.charAt(i))) {
            return true;
        }
    }
    return false;
}

Which are the white spaces in Java?

The documentation specifies what Java considers to be whitespace:

public static boolean isWhitespace(char ch)

Determines if the specified character is white space according to Java. A character is a Java whitespace character if and only if it satisfies one of the following criteria:

  • It is a Unicode space character (SPACE_SEPARATOR, LINE_SEPARATOR, or PARAGRAPH_SEPARATOR) but is not also a non-breaking space ('\u00A0', '\u2007', '\u202F').
  • It is '\u0009', HORIZONTAL TABULATION.
  • It is '\u000A', LINE FEED.
  • It is '\u000B', VERTICAL TABULATION.
  • It is '\u000C', FORM FEED.
  • It is '\u000D', CARRIAGE RETURN.
  • It is '\u001C', FILE SEPARATOR.
  • It is '\u001D', GROUP SEPARATOR.
  • It is '\u001E', RECORD SEPARATOR.
  • It is '\u001F', UNIT SEPARATOR.
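
To see how that definition plays out, here is a minimal sketch (the class and method names are my own, not from the documentation). It checks a few of the characters listed above and contrasts Character.isWhitespace with Character.isSpaceChar, since the non-breaking space is excluded from the first but counted by the second. It assumes Java 8 or later for String.codePoints().

```java
public class WhitespaceKinds {

    // True if any code point in s is whitespace according to Character.isWhitespace.
    static boolean containsJavaWhitespace(String s) {
        return s.codePoints().anyMatch(Character::isWhitespace);
    }

    // Broader check: also counts Unicode space characters that isWhitespace
    // deliberately excludes, such as the non-breaking space U+00A0.
    static boolean containsWhitespaceOrSpaceChar(String s) {
        return s.codePoints()
                .anyMatch(cp -> Character.isWhitespace(cp) || Character.isSpaceChar(cp));
    }

    public static void main(String[] args) {
        String tab = "a\tb";                   // HORIZONTAL TABULATION
        String lineSep = "a\u2028b";           // LINE_SEPARATOR
        String fileSep = "a\u001Cb";           // FILE SEPARATOR
        String nonBreaking = "a\u00A0b";       // non-breaking space

        System.out.println(containsJavaWhitespace(tab));          // true
        System.out.println(containsJavaWhitespace(lineSep));      // true
        System.out.println(containsJavaWhitespace(fileSep));      // true
        System.out.println(containsJavaWhitespace(nonBreaking));  // false
        System.out.println(containsWhitespaceOrSpaceChar(nonBreaking)); // true
    }
}
```

Whether you want the broader check depends on whether non-breaking spaces should count as whitespace in your input.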
Mark Byers
12
boolean containsWhitespace = false;
for (int i = 0; i < text.length() && !containsWhitespace; i++) {
    if (Character.isWhitespace(text.charAt(i))) {
        containsWhitespace = true;
    }
}
return containsWhitespace;

or, using Guava,

boolean containsWhitespace = CharMatcher.WHITESPACE.matchesAnyOf(text);
JB Nizet
  • 4
    or put that in a method to return the boolean and avoid the awkward `break` and "accumulator" variable. – Thilo Aug 08 '12 at 11:31
  • Would there be any reason here to use [Character#isWhitespace(int)](http://docs.oracle.com/javase/7/docs/api/java/lang/Character.html#isWhitespace(int)) instead of the suggested [Character#isWhitespace(char)](http://docs.oracle.com/javase/7/docs/api/java/lang/Character.html#isWhitespace(char))? – Martin Andersson Apr 03 '13 at 18:32
  • 1
    @MartinAndersson (2 years late) the [Character#isWhitespace(int)](http://docs.oracle.com/javase/7/docs/api/java/lang/Character.html#isWhitespace(int)) version accepts a [codepoint](https://en.wikipedia.org/wiki/Code_point) which you would get using [Character#codePointAt(...)](http://docs.oracle.com/javase/7/docs/api/java/lang/Character.html#codePointAt(char[],%20int)). The codepoint could span multiple chars in the string. For example the documentation mentions that it detects LINE_SEPARATOR and PARAGRAPH_SEPARATOR '\u2007', '\u202F' which are more than 8 bits wide – Sparky Dec 08 '15 at 19:32
  • The `!containsWhiteSpace` test should come first, not last, and in any case the variable is redundant, as the `true` case can be replaced by `return true` and the fall-through case by `return false`. – user207421 Dec 01 '16 at 10:29
  • Or you can leave the loop with break; after containsWhitespace was set to true. This would lead to a simpler condition in the for loop. – Lord_PedantenStein Mar 24 '18 at 19:24
3

If you want a regular-expression-based way of doing it:

if (text.split("\\s", -1).length > 1) {
    // text contains whitespace; the -1 limit keeps trailing empty strings,
    // so trailing whitespace such as "abc " is detected too
}
James
2

Use Character.isWhitespace() rather than creating your own.

In Java how does one turn a String into a char or a char into a String?

Community
Aravind Yarram
2

If you can use Apache Commons Lang in your project, the easiest way is to use the StringUtils.containsWhitespace method it provides:

public static boolean containsWhitespace(CharSequence seq)

Check whether the given CharSequence contains any whitespace characters.

Parameters:

seq - the CharSequence to check (may be null) 

Returns:

true if the CharSequence is not empty and contains at least 1 whitespace character

It handles empty and null parameters and provides the functionality at a central place.
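
If adding the dependency is not an option, the documented behaviour is easy to approximate in plain Java. The following is only a sketch of a stand-in with the same contract as described above (null and empty input return false). The class name is hypothetical and this is not the library's source.

```java
public class NullSafeWhitespace {

    // Hypothetical stand-in mirroring the documented contract:
    // returns false for null or empty input, true if any char is whitespace.
    static boolean containsWhitespace(CharSequence seq) {
        if (seq == null || seq.length() == 0) {
            return false;
        }
        for (int i = 0; i < seq.length(); i++) {
            if (Character.isWhitespace(seq.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsWhitespace(null));   // false
        System.out.println(containsWhitespace(""));     // false
        System.out.println(containsWhitespace("a b"));  // true
    }
}
```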

Nicktar
0

From sun docs:

\s A whitespace character: [ \t\n\x0B\f\r]

The simplest way is to use it with regex.
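
Note that the set quoted above is what \s matches by default, so Unicode spaces such as EM SPACE (U+2003) slip through. As a rough sketch (class and method names are mine), compiling the pattern with Pattern.UNICODE_CHARACTER_CLASS, available since Java 7, makes \s follow the Unicode White_Space property instead:

```java
import java.util.regex.Pattern;

public class RegexWhitespace {

    // Default \s: only [ \t\n\x0B\f\r]
    private static final Pattern ASCII_WS = Pattern.compile("\\s");

    // With UNICODE_CHARACTER_CLASS, \s matches the Unicode White_Space property.
    private static final Pattern UNICODE_WS =
            Pattern.compile("\\s", Pattern.UNICODE_CHARACTER_CLASS);

    static boolean containsAsciiWhitespace(String s) {
        return ASCII_WS.matcher(s).find();
    }

    static boolean containsUnicodeWhitespace(String s) {
        return UNICODE_WS.matcher(s).find();
    }

    public static void main(String[] args) {
        String emSpace = "a\u2003b"; // EM SPACE between the letters
        System.out.println(containsAsciiWhitespace(emSpace));   // false
        System.out.println(containsUnicodeWhitespace(emSpace)); // true
    }
}
```

Compiling the patterns once as constants avoids recompiling the regex on every call.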

MByD
0
boolean whitespaceSearchRegExp(String input) {

    return java.util.regex.Pattern.compile("\\s").matcher(input).find();

} 
doxmoxbox
0

Why don't you check whether text.trim() has a different length?

if (text.length() != text.trim().length() || otherConditions) {
    // text has leading or trailing whitespace
    // your code
}
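
One caveat with a length comparison against trim(): it can only notice whitespace at the start or end of the string, because trim() never touches the middle. A small sketch to illustrate (the class and method names are made up for the example):

```java
public class TrimCheck {

    // True only when trim() removed something, i.e. the string starts or ends
    // with characters that trim() strips (code points up to and including U+0020).
    static boolean hasLeadingOrTrailingWhitespace(String s) {
        return s.length() != s.trim().length();
    }

    public static void main(String[] args) {
        System.out.println(hasLeadingOrTrailingWhitespace(" abc")); // true
        System.out.println(hasLeadingOrTrailingWhitespace("abc\n")); // true
        System.out.println(hasLeadingOrTrailingWhitespace("a b"));  // false: inner space is missed
    }
}
```

So this works as a check for padding, but not as a general "contains whitespace" test.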
Wai Ha Lee