I am working on a Java-based Android app which uses a custom web font to show different icons. To use the icons I have created a simple HashMap:

Map<String, String> iconMap = new HashMap<String, String>() {
    {
        put("help", "\ue004");
        put("info", "\ue005");
        ...
        put("search", "\u0022");
        put("delete", "\u005c");
    }
};

This works fine, except that using "\u005c" and "\u0022" is not possible. "\u0022" represents " and "\u005c" represents \. It seems that the compiler translates the Unicode escape first, and "\" is of course not a valid string literal. However, using "\\u005c" does not work either, since the first backslash now escapes the second one, and instead of one Unicode character I get the string \u005c (six chars long).

So, how do I escape the Unicode chars correctly?

Of course I could solve this specific problem by using "\\" and "\"" instead. However, I would like to be sure that the problem does not show up with other chars as well, and I would like to know how to properly escape the Unicode chars.

BTW: Using "\u005c" and "\u0022" in Kotlin is no problem and delivers the correct result.

Andrei Herford
  • A helpful post on Java and Unicode: https://stackoverflow.com/questions/2533097/java-unicode-encoding – Zack Macomber Mar 02 '21 at 15:30
  • Double-quote and backslash are the only characters that behave specially in String literals. So their Unicode equivalents `\u0022` and `\u005c` are the only Unicode escapes that will also behave specially. So, problem solved, I think (;-D) – Kevin Anderson Mar 02 '21 at 15:38
  • If you can format like this, the backslash (and other unicode chars) should print right: `int test = 0x005c;` `System.out.println((char)test);` – Zack Macomber Mar 02 '21 at 15:47

2 Answers

\u in java is not a string escape. It's an escape that the compiler translates before it even tokenizes the source, so it works anywhere in a file, not just inside string literals. This is valid java:

String x = \u0022Hello\u0022;

The reasoning is fairly simple: Sometimes, you edit source files in e.g. US-ASCII, or ISO-8859-1, but you still want to put, say, a unicode snowman in your source file, which would be impossible without such an escape.
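To see that the translation happens before the compiler looks at string literals at all, here is a minimal sketch. The class and method names are made up for illustration; only the escape behaviour itself comes from the language.

```java
class UnicodeEscapeDemo {
    // Both escapes below are replaced by double quotes before tokenizing,
    // so the compiler effectively sees: return "Hello";
    static String quoted() {
        return \u0022Hello\u0022;
    }

    // A doubled backslash is an ordinary string escape for one backslash.
    // The text after it (u0022) is then just five plain characters.
    static String notAnEscape() {
        return "\\u0022";
    }

    public static void main(String[] args) {
        System.out.println(quoted());               // prints Hello
        System.out.println(notAnEscape().length()); // prints 6
    }
}
```

This is also why the escape for a double quote cannot sit inside a string literal by itself: by the time the literal is parsed, it is already a real quote character and ends the string early.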

To put a backslash in a java string, "\\" is all you need. For a quote, "\"" is all you need. If you insist on always using the number for some bizarre reason, octal escapes are available but do not exceed 255 (so you can just cover ASCII-and-a-bit with these). Otherwise, construct them. Easy enough. Thus, either:

put("delete", "\\");

or

put("delete", "" + (char) 0x5C);
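If you would rather keep the numeric code points visible for every icon, one option is to build the strings from integers, so the number never appears as an escape in the source. The sketch below assumes a hypothetical helper named glyph and a hypothetical class IconMapDemo; neither is from the question.

```java
import java.util.HashMap;
import java.util.Map;

class IconMapDemo {
    // Hypothetical helper: turns a numeric code point into a string at
    // run time, so the compiler never sees it as an escape sequence.
    static String glyph(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    static Map<String, String> buildIconMap() {
        Map<String, String> icons = new HashMap<>();
        icons.put("help", "\ue004");      // private-use code point, safe as an escape
        icons.put("info", "\ue005");
        icons.put("search", glyph(0x22)); // double quote
        icons.put("delete", glyph(0x5C)); // backslash
        return icons;
    }

    public static void main(String[] args) {
        Map<String, String> icons = buildIconMap();
        System.out.println(icons.get("search").equals("\"")); // prints true
        System.out.println(icons.get("delete").equals("\\")); // prints true
    }
}
```

Character.toChars also handles code points above U+FFFF, which a plain (char) cast would truncate.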

Kotlin made a different decision and more or less posits that you edit your source files in UTF-8, period. Java made the decision that it's a bridge too far to just decree this. Possibly related to the fact that java's origins predate kotlin's by more than 15 years. Back then UTF-8 was a pretty cool idea, not yet a de facto standard.

rzwitserloot

This matches the previous explanation.

Adding the entries this way also worked:

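Map<String, String> testMap = new HashMap<String, String>() {
    {
        put("help", "\ue004");
        put("info", "\ue005");
        // Each pair of escapes becomes an ordinary two-character string escape
        put("search", "\u005c\u0022"); // yields a double quote
        put("delete", "\u005c\u005c"); // yields a single backslash
    }
};

// Print every key followed by its icon character
testMap.entrySet().forEach(
        entry -> System.out.println(entry.getKey() + entry.getValue()));
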
Wang Du