The Java node receives an Erlang string encoded in UTF-8. Its Java type is OtpErlangString. If I simply call .toString() or .stringValue(), the resulting java.lang.String has invalid code points (basically, every byte of the Erlang string is treated as a separate character).

Now, I want to use new String(bytes, "UTF-8") when creating the Java String, but how do I get the bytes from the OtpErlangString?

Martin Dimitrov

1 Answer

It's strange that you get an OtpErlangString on the Java side when you use UTF-8 characters. I get an object of this type only if I use ASCII characters. If I add at least one UTF-8 character, the resulting type is OtpErlangList (which is logical, as strings are just lists of integers in Erlang), and then I can use its stringValue() method. So after sending a string from Erlang like this:

(waco@host)8> {proc, java1@host} ! "ąćśźżęółńa".
[261,263,347,378,380,281,243,322,324,97]

On the Java node I receive and print it with:

OtpErlangList l = (OtpErlangList) mbox.receive();
System.out.println(l.stringValue());

The output is correct:

ąćśźżęółńa

However, if that is not the case in your situation, you could try to work around it by forcing the OtpErlangList representation, e.g. by adding an empty tuple as the very first element of the string list:

(waco@wborowiec)11> {proc, java1@wborowiec} ! [{}] ++ "ąćśźżęółńa".
[{},261,263,347,378,380,281,243,322,324,97]

And on the Java side, something like:

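// requires java.util.Arrays to be imported
OtpErlangList l = (OtpErlangList) mbox.receive();
// drop the leading empty tuple, which was added only to force the list representation
OtpErlangObject[] elements = l.elements();
OtpErlangObject[] strArr = Arrays.copyOfRange(elements, 1, elements.length);
OtpErlangList l2 = new OtpErlangList(strArr);
System.out.println(l2.stringValue());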
Wacław Borowiec
  • Sending [208, 180, 208, 176], which is "да" ("yes" in Russian), results in an `OtpErlangString`. Prepending an empty tuple to the list to force the creation of an `OtpErlangList` object is a neat trick, but isn't there an easier solution? Isn't there a way to extract the byte array from the `OtpErlangString` object? – Martin Dimitrov Jan 19 '12 at 12:21
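
One possible way to get the bytes back without changing the Erlang side is sketched below. It rests on an assumption that is not confirmed in this thread: that stringValue() on the OtpErlangString maps each received byte to exactly one char with the same numeric value (0 to 255), which is what the question describes. If that holds, encoding the String as ISO-8859-1 should return the original bytes, and those can then be decoded as UTF-8. The class and method names (Utf8Recover, recoverUtf8) are hypothetical, and the input is simulated with plain Java so the example runs without JInterface.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Recover {
    // Hypothetical helper: expects a String in which every char holds one raw byte (0..255),
    // e.g. the value assumed to come from OtpErlangString.stringValue().
    public static String recoverUtf8(String oneCharPerByte) {
        // ISO-8859-1 maps chars 0..255 to the identical byte values, restoring the raw bytes
        byte[] raw = oneCharPerByte.getBytes(StandardCharsets.ISO_8859_1);
        // Interpret the restored bytes as UTF-8
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Simulates what the Java node is assumed to see for the Erlang list [208, 180, 208, 176]
        String received = new String(new char[] {208, 180, 208, 176});
        System.out.println(recoverUtf8(received)); // prints "да" on a UTF-8 capable console
    }
}
```

With the real library, the call would presumably look like recoverUtf8(erlangString.stringValue()). Note that this only works while every char is in the range 0 to 255; a char outside that range is replaced by '?' during the ISO-8859-1 encoding step.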