A file containing special characters such as ^A is stored in a byte array. How can I detect these special characters and remove them?
Viewed 2,534 times
0
-
The 'special characters' wouldn't happen to include `Â`? – artbristol Nov 03 '12 at 15:08
-
The word you're looking for is **nonprintable** characters. http://stackoverflow.com/questions/7161534/fastest-way-to-strip-all-non-printable-characters-from-a-java-string http://stackoverflow.com/questions/6198986/how-can-i-replace-non-printable-unicode-characters-in-java – jonvuri Nov 03 '12 at 15:16
-
You should really try to explain what you've already tried, and show us some code if that is available. – Maarten Bodewes Nov 04 '12 at 11:38
1 Answer
0
If you aren't using a fancy encoding (that is, the data is plain ASCII), all uppercase letters have values from 65 to 90 and all lowercase letters have values from 97 to 122 (see ASCII encoding). Bytes with any other value are not letters of the alphabet.
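
Since the question asks for detection and removal rather than just letter ranges, here is a minimal sketch of filtering at the byte level. It assumes the data is plain ASCII and that "special" means any byte outside the printable range 32 to 126; the class and method names (`ControlCharFilter`, `stripNonPrintable`) are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class ControlCharFilter {

    // Keep only printable ASCII (space through tilde); drop every other byte.
    static byte[] stripNonPrintable(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(input.length);
        for (byte b : input) {
            int value = b & 0xFF; // Java bytes are signed, so read the value as unsigned
            if (value >= 32 && value <= 126) {
                out.write(value);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // 1 is ^A, 0x7F is DEL; both are dropped, and so is the trailing newline.
        byte[] buf = {'a', 1, 'b', 0x7F, 'c', '\n'};
        byte[] cleaned = stripNonPrintable(buf);
        System.out.println(new String(cleaned, StandardCharsets.US_ASCII));
    }
}
```

Note that this range also removes tabs and line breaks. If those should survive, the condition would need to allow 9, 10 and 13 as well.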

jlordo
-
It's probably more useful in this case to note that the *nonprintable* or *control* characters range from 0 to 31 in ASCII (plus 127, DEL) - for instance, ^A, which is character 1. – jonvuri Nov 03 '12 at 15:20
-
@Kiyura Thanks for pointing it out. I'm looking for a way to get rid of non-printable characters. For that, I'm now using `replaceAll`. Code snippet: `String s = new String(buf, "US-ASCII"); s = s.replaceAll("[^\\p{Print}]", ""); System.out.println(s);` where `buf` is the byte array. When I print the resulting string with `System.out.println`, nothing is displayed on the console. However, if I loop through the string and print each character individually, the characters do appear on the console. Why is that? – Manisha Nov 04 '12 at 09:25
-
@Manisha, you may be doing something wrong elsewhere in your code. This gist works fine for me: https://gist.github.com/4012570 Notice that if you save it as `Main.java`, compile it with `javac Main.java`, and run it in a terminal with `java Main`, the control characters are present the first time (the 'o' is backspaced over) and stripped the second time, and both times the string prints fine. – jonvuri Nov 04 '12 at 16:51
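
For readers who cannot open the gist, the following is a small sketch of the `replaceAll` approach quoted in the comments. It is not the contents of the gist; the class name `PrintableRegexDemo`, the helper `keepPrintable` and the sample bytes are assumptions chosen for illustration. Printing the string lengths before and after is one way to check whether the cleaned string is really empty or the problem lies elsewhere.

```java
import java.nio.charset.StandardCharsets;

final class PrintableRegexDemo {

    // Remove every character that the POSIX class \p{Print} does not match.
    static String keepPrintable(String s) {
        return s.replaceAll("[^\\p{Print}]", "");
    }

    public static void main(String[] args) {
        // 8 is backspace and 1 is ^A; both are control characters.
        byte[] buf = {'f', 'o', 'o', 8, 'b', 'a', 'r', 1};
        String raw = new String(buf, StandardCharsets.US_ASCII);
        String cleaned = keepPrintable(raw);

        // Comparing lengths shows how many characters were stripped.
        System.out.println(raw.length() + " -> " + cleaned.length());
        System.out.println(cleaned);
    }
}
```

By default `\p{Print}` covers only printable ASCII including the space, so non-ASCII letters would also be removed by this pattern.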