Byte[] Array to String C# to Java Without Encoding

Question

This piece of C# code (to my knowledge) converts byte[] to String without using encoding. Now I'm trying to do the same in Java but I can't find the right encoding to produce the same result.

TLDR: I'm looking for the Java code/solution which produces the same result as the C# code below.

C#:

public static byte[] StringToByteArray(String value){
        byte[] bytes = new byte[value.Length * sizeof(char)];
        Buffer.BlockCopy(value.ToCharArray(), 0, bytes, 0, bytes.Length);
        return bytes;
}

public static String ByteArrayToString(byte[] value){
        char[] chars = new char[value.Length / sizeof(char)];
        Buffer.BlockCopy(value, 0, chars, 0, value.Length);
        return new string(chars);
}

"without using encoding" -- that code still "uses an encoding" in the sense that the output version produces UTF-16-encoded output, and the input version expects it to be UTF-16-encoded input. — Joe Amenta, May 20 '16 at 13:19
Every conversion between a string and a byte[] uses an encoding, whether explicit or implicit. — ManoDestra, May 20 '16 at 13:23
Thinking about it now it does make sense, since each byte needs to be mapped to "something" and that "something" is based on what Encoding I use. Thank you. — user5581557, May 20 '16 at 13:25
There's literally no such thing as 'without using encoding'; encoding is how characters translate to bytes. In modern environments, the overall optimal text encoding to use is probably UTF-8, though, not the C# default UTF-16. — Nyerguds, May 20 '16 at 13:28
This may help you a little further, regarding the Encoding.Default property that you can also use, as it gives some explanation as to the default encoding in general: https://msdn.microsoft.com/en-us/library/system.text.encoding.default%28v=vs.110%29.aspx. If you want to be certain of the encoding used however, then you should fix it to something specific yourself, of course. — ManoDestra, May 20 '16 at 13:29
Found this link as well, which may be of some assistance to you, regarding encodings in Java: http://stackoverflow.com/questions/5729806/encode-string-to-utf-8 — ManoDestra, May 20 '16 at 17:58

score 0 · Answer 1 · answered May 20 '16 at 13:19


public static byte[] StringToByteArray(String value){
        return Encoding.UTF8.GetBytes(value);
}

public static String ByteArrayToString(byte[] value){
        return Encoding.UTF8.GetString(value);
}

answered May 20 '16 at 13:19

Alper Şaldırak


Careful there, though. This will put a 3-byte byte order mark at the start of all output bytes. You need to use `new UTF8Encoding(false).GetBytes(value)`. – Nyerguds May 20 '16 at 13:31
This doesn't look like Java code to me. What Java classes are you using here? Or have you written C# code by mistake? – ManoDestra May 20 '16 at 17:57

score 0 · Answer 2 · edited May 23 '17 at 11:52

As per your request, here are some examples of the equivalent code in Java...

The Basics

// Converts the string using the platform's default character encoding
// (roughly the equivalent of Encoding.Default in C#).
byte[] defaultBytes = text.getBytes();

// Converts the string using the given character encoding, in this case UTF-8.
byte[] utf8Bytes = text.getBytes("UTF-8");

// Converts a byte array to a string using the default encoding.
String defaultText = new String(defaultBytes);

// Converts a byte array to a string using the given encoding.
String utf8Text = new String(utf8Bytes, "UTF-8");
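One thing to watch with the overloads that take the encoding name as a string: they declare the checked UnsupportedEncodingException, so they only compile inside a try block or a method that declares it. A minimal sketch using the java.nio.charset.StandardCharsets constants (available since Java 7), which avoid that. The class and method names are made up for illustration:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of any library.
final class CharsetConstants {
    // Encodes with UTF-8; no checked exception, because the charset is
    // passed as a Charset object rather than looked up by name.
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes UTF-8 bytes back into a String.
    static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = toUtf8("\u00e9"); // U+00E9, "e" with acute accent
        System.out.println(bytes.length); // 2
        System.out.println(fromUtf8(bytes).equals("\u00e9")); // true
    }
}
```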

Code Example

public class EncodingTest {
    public static void main(String[] args) {
        try {
            String originalText = "The quick brown fox jumped over the lazy dog.";

            byte[] defaultBytes = originalText.getBytes();
            byte[] utf8Bytes = originalText.getBytes("UTF-8");
            byte[] utf16Bytes = originalText.getBytes("UTF-16");
            byte[] isoBytes = originalText.getBytes("ISO-8859-1");

            System.out.println("Original Text: " + originalText);
            System.out.println("Text Length: " + originalText.length());

            System.out.println("Default Bytes Length: " + defaultBytes.length);
            System.out.println("UTF-8 Bytes Length: " + utf8Bytes.length);
            System.out.println("UTF-16 Bytes Length: " + utf16Bytes.length);
            System.out.println("ISO-8859-1 Bytes Length: " + isoBytes.length);

            String newDefaultText = new String(defaultBytes);
            String newUtf8Text = new String(utf8Bytes, "UTF-8");
            String newUtf16Text = new String(utf16Bytes, "UTF-16");
            String newIsoText = new String(isoBytes, "ISO-8859-1");

            System.out.println("New Default Text: " + newDefaultText);
            System.out.println("New UTF-8 Text: " + newUtf8Text);
            System.out.println("New UTF-16 Text: " + newUtf16Text);
            System.out.println("New ISO-8859-1 Text: " + newIsoText);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
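If you run the example, note that the "UTF-16" length is two bytes more than twice the text length. Java's "UTF-16" charset writes a big-endian byte order mark when encoding, while the UTF-16LE and UTF-16BE variants do not. A small sketch that shows the difference (the class and method names are only for illustration):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class comparing encoded lengths of the same text.
final class EncodedLengths {
    // Returns how many bytes the text occupies in the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String text = "fox";
        System.out.println(encodedLength(text, StandardCharsets.UTF_8));      // 3
        System.out.println(encodedLength(text, StandardCharsets.ISO_8859_1)); // 3
        System.out.println(encodedLength(text, StandardCharsets.UTF_16LE));   // 6
        System.out.println(encodedLength(text, StandardCharsets.UTF_16));     // 8 (2-byte BOM + 6)
    }
}
```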

More info on String constructors here.

And some further tutorials on Java encodings here.

And as I stated in comments, there is no such thing as "without encoding" for string/byte conversions. There may be an implicit or default encoding being used, but there's always an encoding required to convert from string to byte[] and vice versa.
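For the C# code in the question specifically, the implicit encoding is UTF-16: Buffer.BlockCopy copies the chars' in-memory bytes, two per char, with no byte order mark. Assuming the C# side runs on a little-endian machine (typical for x86 and ARM setups), the closest Java match should be UTF-16LE. A minimal sketch under that assumption, with a hypothetical class name and method names mirroring the C# helpers:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical class; mirrors the two C# helpers from the question.
final class Utf16LeConversion {
    // Two bytes per UTF-16 code unit, low byte first, no byte order mark.
    static byte[] stringToByteArray(String value) {
        return value.getBytes(StandardCharsets.UTF_16LE);
    }

    // Interprets each pair of bytes as one little-endian UTF-16 code unit.
    static String byteArrayToString(byte[] value) {
        return new String(value, StandardCharsets.UTF_16LE);
    }

    public static void main(String[] args) {
        byte[] bytes = stringToByteArray("Hi");
        System.out.println(Arrays.toString(bytes));   // [72, 0, 105, 0]
        System.out.println(byteArrayToString(bytes)); // Hi
    }
}
```

One caveat: the C# version copies raw memory and never validates anything, whereas Java's charset conversion substitutes a replacement for malformed input, so strings containing unpaired surrogates may not round-trip identically between the two.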

Also: How to convert Java String into byte[]?
