How can I remove non-ASCII characters (Altcodes) from a string such as: → ← █ ◄ ► ∙
-
`yourStringVariable.replaceAll("[^\\x20-\\x7E]", "");` – DevilsHnd - 退職した Dec 06 '20 at 04:36
-
What exactly qualifies as AltCode? `ö`? `€`? Emojis? Do you want to only keep ASCII characters? – Siguza Dec 06 '20 at 04:37
-
Oh...just to be clear, all characters in ASCII have an ALT Code number (A is 065, B is 066, etc). – DevilsHnd - 退職した Dec 06 '20 at 04:50
-
I only want to keep ASCII Characters – RecursiveMethod Dec 06 '20 at 05:03
1 Answer
From your comment, by "AltCode", you're referring to any non-ASCII character.
One solution is to use the method String.replaceAll(String regex, String replacement), which replaces every substring matching the given regular expression (regex) with the given replacement string. From its documentation:
Replaces each substring of this string that matches the given regular expression with the given replacement.
Java has the "\p{ASCII}" pattern, which matches only ASCII characters (\x00 through \x7F). It can be negated with the "[^…]" syntax to match any non-ASCII character instead. The matched characters can then be replaced with the empty string, effectively removing them from the resulting string.
String s = "A→←B█◄C►";
String stripped = s.replaceAll("[^\\p{ASCII}]", "");
System.out.println(stripped); // Prints "ABC"
The full list of supported regex constructs is documented in the Pattern class.
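The pattern suggested in the comments, "[^\\x20-\\x7E]", is not quite equivalent to "[^\\p{ASCII}]": it keeps only printable ASCII, so it also removes control characters such as tab and newline. The following is a minimal sketch of the difference; the class and method names are made up for illustration.

```java
public class AsciiPatternComparison {
    // Keeps the full ASCII range (U+0000 to U+007F), including control characters.
    static String keepAscii(String s) {
        return s.replaceAll("[^\\p{ASCII}]", "");
    }

    // Keeps only printable ASCII (0x20 to 0x7E); tab, newline, etc. are removed too.
    static String keepPrintableAscii(String s) {
        return s.replaceAll("[^\\x20-\\x7E]", "");
    }

    public static void main(String[] args) {
        String s = "A\t→B\n█C";
        System.out.println(keepAscii(s).equals("A\tB\nC"));      // true: tab and newline survive
        System.out.println(keepPrintableAscii(s).equals("ABC")); // true: they are stripped as well
    }
}
```

Which one to use depends on whether whitespace such as line breaks should survive the cleanup.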
Note: If you are going to apply this pattern many times within a run, it is more efficient to use a compiled Pattern directly rather than String.replaceAll. This way the pattern is compiled only once and reused, rather than being recompiled each time replaceAll is called:
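import java.util.regex.Pattern;

public class AsciiStripper {
    // Compiled once and reused; matches any character outside the ASCII range.
    private static final Pattern NON_ASCII_PATTERN = Pattern.compile("[^\\p{ASCII}]");

    // Despite its name, this removes the non-ASCII characters and keeps the ASCII ones.
    public String stripAscii(String s) {
        return NON_ASCII_PATTERN.matcher(s).replaceAll("");
    }
}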
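As a usage sketch, here is a self-contained variant that applies one precompiled pattern to several inputs, including an accented letter and an emoji (the kinds of characters asked about in the comments). The class name, method name, and sample strings are assumptions chosen for illustration.

```java
import java.util.regex.Pattern;

public class AsciiStripperDemo {
    // Compiled once; reused for every input below.
    private static final Pattern NON_ASCII = Pattern.compile("[^\\p{ASCII}]");

    static String strip(String s) {
        return NON_ASCII.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        // \u00e9 is "é"; \uD83D\uDE00 is an emoji stored as a surrogate pair.
        String[] inputs = { "A→←B", "caf\u00e9", "ok \uD83D\uDE00!" };
        for (String in : inputs) {
            System.out.println("[" + strip(in) + "]");
        }
        // Prints "[AB]", "[caf]" and "[ok !]", one per line.
    }
}
```

Note that accented letters are dropped rather than converted to their base letter; if "é" should become "e", a different approach is needed.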
