2

When I use reader.readLine() (reader is a java.io.RandomAccessFile), the returned string is always 80 characters long: the main text is followed by padding made of Unicode space characters. Is there a way to remove those unwanted characters? String.trim() does not work on them.

  • 1
    The question is a bit too narrow. For example, I searched stackoverflow for "[java] internationalization trim" and "[java] unicode trim" and did *not* find this question. You really want a trim() function that is I18N/Unicode aware; if the question was phrased that way, more people would be able to find the answer below. – djb Oct 09 '15 at 13:49

5 Answers

7

You can use StringUtils.strip from Commons Lang. It is Unicode-aware.

Thilo
  • Well, it uses Character.isWhitespace, which does not work well here because it returns false for non-breaking spaces such as U+00A0. It should use Character.isSpaceChar to be fully Unicode-aware. – Michal Bernhard Aug 12 '16 at 11:12
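To make the difference in that comment concrete, here is a small sketch using only the JDK. The class name SpaceCharDemo and the helper classify are made up for illustration; the three characters are just examples I picked, not something taken from the asker's file.

```java
public class SpaceCharDemo {
    // Hypothetical helper: reports how the two JDK predicates classify a character.
    static String classify(char c) {
        return "isWhitespace=" + Character.isWhitespace(c)
                + ", isSpaceChar=" + Character.isSpaceChar(c);
    }

    public static void main(String[] args) {
        // U+00A0 NO-BREAK SPACE: a Unicode space separator, but excluded by isWhitespace
        System.out.println(classify('\u00A0')); // isWhitespace=false, isSpaceChar=true
        // U+3000 IDEOGRAPHIC SPACE: accepted by both predicates
        System.out.println(classify('\u3000')); // isWhitespace=true, isSpaceChar=true
        // Tab: a control character, so isSpaceChar rejects it
        System.out.println(classify('\t'));     // isWhitespace=true, isSpaceChar=false

        // String.trim() only strips characters up to U+0020, so the padding survives
        System.out.println("abc\u00A0".trim().length()); // 4
    }
}
```

So neither predicate alone covers every case, which is probably why String.trim() and isWhitespace-based helpers leave the padding in place.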
3

You can write a custom method in Java that removes the Unicode space characters for your specific purpose, using the Character.isWhitespace(char) and Character.isSpaceChar(char) methods.
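A minimal sketch of such a method, assuming you want to strip anything that either predicate treats as a space. The class name UnicodeTrim and the helper isSpace are hypothetical names chosen for this example; adjust the predicate if you only care about a specific set of characters.

```java
public class UnicodeTrim {
    // Assumption: a character counts as a space if either JDK predicate accepts it.
    static boolean isSpace(int codePoint) {
        return Character.isWhitespace(codePoint) || Character.isSpaceChar(codePoint);
    }

    // Removes leading and trailing space characters; interior spaces are kept.
    public static String trim(String s) {
        int start = 0;
        int end = s.length();
        // Advance past leading spaces, one code point at a time
        while (start < end) {
            int cp = s.codePointAt(start);
            if (!isSpace(cp)) break;
            start += Character.charCount(cp);
        }
        // Step back over trailing spaces
        while (end > start) {
            int cp = s.codePointBefore(end);
            if (!isSpace(cp)) break;
            end -= Character.charCount(cp);
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        // Tab and space in front; no-break, em, ideographic and plain space behind
        String padded = "\t main text\u00A0\u2003\u3000 ";
        System.out.println("[" + trim(padded) + "]"); // prints "[main text]"
    }
}
```

In the question's case this would be called on each line returned by reader.readLine() (after a null check for end of file).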

The Spring framework has a StringUtils class with a trimWhitespace(String) method which appears to be based on Character.isWhitespace(char) from the source code here.

AllTooSir
0

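Use Google Guava (recent versions replace the CharMatcher.WHITESPACE constant with the CharMatcher.whitespace() method):

CharMatcher.whitespace().trimFrom(source);

Or try this gist: https://gist.github.com/michalbcz/4861a2b8ed73bb73764e909b87664cb2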

Michal Bernhard
0

If you do not want to pull in a big library, just use:

str.replaceAll("^[\\s\\p{Z}]+|[\\s\\p{Z}]+$", "");

Testing

    public class TrimTest {
        // Strips leading and trailing whitespace, including Unicode separators (\p{Z}).
        public static String trim(String str) {
            return str.replaceAll("^[\\s\\p{Z}]+|[\\s\\p{Z}]+$", "");
        }

        public static void main(String[] args) {
            System.out.println(trim("\t tes ting \u00a0").length()); // 8
            System.out.println(trim("\t testing \u00a0").length());  // 7
            System.out.println(trim("tes ting \u00a0").length());    // 8
            System.out.println(trim("\t tes ting").length());        // 8
        }
    }
lehanh
-2

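It would have been faster to search Stack Overflow for this question, because there are multiple questions on this topic. Try this (note that it removes every character matched by \s, including whitespace inside the string, and that \s does not match Unicode spaces such as U+00A0 by default):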

st.replaceAll("\\s","")

check this one here: link

bofredo