I want to turn two-character strings that represent special non-printing characters, such as "\t", "\n", "\a", ..., into their single-character representations. Is there a general way to do this? By general, I mean an alternative to explicitly mapping each string to its character.

Ideal behavior (my strings are actually read from a file; this is just an example):

String str = "\\t";
char c = toChar(str); // c now represents the tab character '\t'
Basil Bourque
  • Have you taken a look at this [question](https://stackoverflow.com/questions/1327355/is-there-a-java-function-which-parses-escaped-characters)? It seems to cover your case with an Apache Commons method, if you are able to use that. Just realized that question is really old though, so maybe out of date. – Nexevis Aug 31 '23 at 18:44
  • @Nexevis The Answers on [that Question](https://stackoverflow.com/q/1327355/642706) are outdated. None use the new functionality built into Java 15+. See [my Answer](https://stackoverflow.com/a/77019207/642706) here for details. – Basil Bourque Sep 01 '23 at 02:21

3 Answers

tl;dr

In Java 15+:

    ( "\\" + "t" ).translateEscapes()  -->  TAB character
    ( "\\" + "n" ).translateEscapes()  -->  LINE FEED character
    ( "\\" + "s" ).translateEscapes()  -->  SPACE character
    …

String#translateEscapes

In Java 15+, use the String#translateEscapes method.

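The char type is a 16-bit UTF-16 code unit, so a single char cannot represent every Unicode character. In that sense it has been essentially broken since Java 2 and legacy since Java 5, which added code point support. Make a habit of using code point integers and String objects instead.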
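To illustrate the point about char, here is a minimal sketch using a character outside the Basic Multilingual Plane. The class name `CodePointDemo` and the helper `fromCodePoint` are made up for this example; the `String` and `Character` calls are standard Java.

```java
public class CodePointDemo {

    // Hypothetical helper: builds a String holding exactly one code point.
    public static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the Basic Multilingual Plane, so UTF-16
        // needs two char values (a surrogate pair) to encode it.
        String face = fromCodePoint(0x1F600);

        // Number of char values in the string.
        System.out.println(face.length());
        // Number of actual code points in the string.
        System.out.println(face.codePointCount(0, face.length()));
        // charAt(0) returns only half of the character.
        System.out.println(Character.isSurrogate(face.charAt(0)));
        // codePointAt(0) returns the whole character as an int.
        System.out.println(Integer.toHexString(face.codePointAt(0)));
    }
}
```

For the escapes in this question (tab, line feed, and so on) a char would happen to work, since they all fit in 16 bits. The habit matters once arbitrary text is involved.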

final String input = "\\" + "t";
final String TAB = input.translateEscapes ( );

Dump to console. Use code points to inspect actual contents of each string.

// Requires: import java.util.Arrays;
System.out.println ( "input = " + input );
System.out.println ( "input code points = " + Arrays.toString ( input.codePoints ( ).toArray ( ) ) );
System.out.println ( "TAB code points = " + Arrays.toString ( TAB.codePoints ( ).toArray ( ) ) );

When run, the output verifies that we began with a two-character string, a backslash followed by a "t". The call transformed that input into a single TAB character, with a code point of nine.

input = \t
input code points = [92, 116]
TAB code points = [9]
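Since the question asks for a `toChar`-style function, the approach above could be wrapped roughly as follows. This is only a sketch: the class `EscapeToCodePoint` and the method `toCodePoint` are hypothetical names, and returning an empty `OptionalInt` on failure is one possible design choice. It relies on `translateEscapes` throwing `IllegalArgumentException` for sequences that Java does not define, such as the "\a" mentioned in the question.

```java
import java.util.OptionalInt;

public class EscapeToCodePoint {

    // Hypothetical helper: translates the input and returns its code point
    // if the translated text is exactly one code point long, else empty.
    public static OptionalInt toCodePoint(String escape) {
        try {
            String translated = escape.translateEscapes(); // Java 15+
            if (translated.codePointCount(0, translated.length()) == 1) {
                return OptionalInt.of(translated.codePointAt(0));
            }
            return OptionalInt.empty();
        } catch (IllegalArgumentException e) {
            // Thrown for malformed or undefined escapes, such as "\a".
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(toCodePoint("\\t")); // backslash + t
        System.out.println(toCodePoint("\\n")); // backslash + n
        System.out.println(toCodePoint("\\a")); // not a Java escape
    }
}
```

Returning a code point rather than a char keeps the helper consistent with the advice above; callers that really need a String can use `Character.toString(int)` on Java 11+.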
Basil Bourque
You can use the StringEscapeUtils class from the Apache Commons Text library to achieve this:

String str = "\\t";
String unescapedStr = StringEscapeUtils.unescapeJava(str);
char c = unescapedStr.charAt(0);

The StringEscapeUtils.unescapeJava method converts escape sequences in a string to their corresponding characters. In this case, it converts the two-character string \t (written "\\t" in Java source) to the tab character '\t'.

Remember to include the Apache Commons Text library in your project's dependencies!

If for some reason you don't want to or can't use a library, you can write the toChar method yourself:

public static char toChar(String str) {
    switch (str) {
    case "\\t":
        return '\t';
    case "\\b":
        return '\b';
    case "\\n":
        return '\n';
    case "\\r":
        return '\r';
    case "\\f":
        return '\f';
    case "\\'":
        return '\'';
    case "\\\"":
        return '\"';
    case "\\\\":
        return '\\';
    default:
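        // Report the offending input so callers can diagnose bad data.
        throw new IllegalArgumentException("Unsupported escape sequence: " + str);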
    }
}
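If the lines read from the file can contain several escapes mixed with ordinary text, the same library-free idea can be extended to whole strings. The sketch below is one possible way to do that, not part of the original answer; the class `ManualUnescape`, the method `unescape`, and the choice to throw on unknown escapes are all assumptions. It covers only the eight escapes handled by the switch above.

```java
import java.util.Map;

public class ManualUnescape {

    // Same escapes as the switch-based toChar, keyed by the character
    // that follows the backslash. Map.of requires Java 9+.
    private static final Map<Character, Character> ESCAPES = Map.of(
            't', '\t', 'b', '\b', 'n', '\n', 'r', '\r',
            'f', '\f', '\'', '\'', '"', '"', '\\', '\\');

    // Hypothetical helper: replaces every supported escape in the input.
    public static String unescape(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (ch != '\\') {
                out.append(ch); // ordinary text is copied unchanged
                continue;
            }
            if (i + 1 >= input.length()) {
                throw new IllegalArgumentException("Dangling backslash at index " + i);
            }
            char next = input.charAt(++i); // consume the escape letter
            Character replacement = ESCAPES.get(next);
            if (replacement == null) {
                throw new IllegalArgumentException("Unsupported escape sequence: \\" + next);
            }
            out.append(replacement);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String fromFile = "name\\tvalue\\n"; // stands in for a line read from a file
        System.out.println(unescape(fromFile).equals("name\tvalue\n"));
    }
}
```

Scanning char by char is safe here because the backslash and every escape letter fit in a single char; any surrogate pairs in the surrounding text are copied through untouched.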
Diego Borba
The String class itself has a translateEscapes method (added in Java 15) that does this:

String str = "\\t";
char c = str.translateEscapes().charAt(0);
VGR