334

I have to convert a byte array to string in Android, but my byte array contains negative values.

If I convert that string again to byte array, values I am getting are different from original byte array values.

What can I do to get proper conversion? Code I am using to do the conversion is as follows:

// Code to convert byte arr to str:
byte[] by_original = {0,1,-2,3,-4,-5,6};
String str1 = new String(by_original);
System.out.println("str1 >> "+str1);

// Code to convert str to byte arr:
byte[] by_new = str1.getBytes();
for(int i=0;i<by_new.length;i++) 
    System.out.println("by_new["+i+"] >> "+by_new[i]);

I am stuck in this problem.

peterh
  • 11,875
  • 18
  • 85
  • 108
Jyotsna
  • 4,021
  • 4
  • 22
  • 25
  • 5
    Why are you trying to convert arbitrary binary data to a String in the first place? Apart from all the charset problems the answers already mention, there's also the fact that you're abusing String if you do this. What's wrong with using a `byte[]` for your binary data and `String` for your text? – Joachim Sauer Oct 08 '09 at 08:16
  • 13
    @Joachim - sometimes you have external tools that can do things like store strings. You want to be able to turn a byte array into a (encoded in some way) string in that case. – James Moore Jul 11 '11 at 00:23

25 Answers

499

Your byte array must have some encoding. The encoding cannot be ASCII if you've got negative values. Once you figure that out, you can convert a set of bytes to a String using:

byte[] bytes = {...};
String str = new String(bytes, StandardCharsets.UTF_8); // for UTF-8 encoding

There are a bunch of encodings you can use; see the supported encodings in the Oracle javadocs.
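As a minimal sketch of what "explicit encoding" looks like when the bytes really are text, the example below decodes and re-encodes with the same charset. The class and method names are made up for illustration; only String, getBytes and StandardCharsets come from the JDK.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ExplicitCharsetDemo {

    // Decode bytes that are known to be UTF-8 text.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Encode with the same charset so the round trip is symmetric.
    static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is two bytes in UTF-8: 0xC3 0xA9,
        // both of which are negative when viewed as Java bytes.
        byte[] bytes = {(byte) 0xC3, (byte) 0xA9};
        String text = decodeUtf8(bytes);
        System.out.println(text.length());                          // 1
        System.out.println(Arrays.equals(bytes, encodeUtf8(text))); // true
    }
}
```

Negative byte values are therefore not a problem in themselves; what matters is whether the bytes form valid text in the chosen charset.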

Honza Zidek
  • 9,204
  • 4
  • 72
  • 118
omerkudat
  • 9,371
  • 4
  • 33
  • 42
  • 4
    @MauricePerry can you explain why it will not work with `UTF-8` ? – Asif Mushtaq Mar 31 '16 at 08:46
  • 14
    @UnKnown because UTF-8 encodes some characters as 2- or 3- byte strings. Not every byte array is a valid UTF-8-encoded string. ISO-8859-1 would be a better choice: here each character is encoded as a byte. – Maurice Perry Apr 01 '16 at 06:06
  • 2
    This might work, but you should avoid using String constructor at all cost. – hfontanez Jun 26 '17 at 17:13
  • to map one byte to one char (with 8859-1) and no exception handling (with nio.charset): `String str = new String(bytes, java.nio.charset.StandardCharsets.ISO_8859_1);` – iman Nov 20 '17 at 07:33
  • @MauricePerry thanks. utf-8 did not work for my input but base64 solved the problem – Al-Alamin Nov 28 '17 at 07:44
  • 4
    since Java 1.7, you can use new String(bytes, StandardCharsets.UTF_8) – ihebiheb Feb 18 '19 at 19:45
  • @MauricePerry But this question is asking how to convert from binary to string, and ISO-8859-1 (Latin-1) officially has still character values that don't map. Base64 or hexadecimals is the only good answer if the input bytes can have **any** value. UTF-8 is utterly ridiculous, and I wonder how this kind of shit answers get so many upvotes. – Maarten Bodewes Dec 27 '21 at 12:13
  • @MaartenBodewes Indeed: Base64 is a much better solution than any charset. – Maurice Perry Jan 03 '22 at 12:42
119

The "proper conversion" between byte[] and String is to explicitly state the encoding you want to use. If you start with a byte[] and it does not in fact contain text data, there is no "proper conversion". Strings are for text, byte[] is for binary data, and the only really sensible thing to do is to avoid converting between them unless you absolutely have to.

If you really must use a String to hold binary data then the safest way is to use Base64 encoding.

Kevin Kopf
  • 13,327
  • 14
  • 49
  • 66
Michael Borgwardt
  • 342,105
  • 78
  • 482
  • 720
  • 2
    Yes, [character encoding is something you must know about](http://stackoverflow.com/questions/10611455/what-is-character-encoding) to convert between strings and bytes. – Raedwald Apr 10 '15 at 12:15
  • 3
    Base64 encoding solved my problem. UTF-8 did not work for all inputs – Al-Alamin Nov 28 '17 at 07:42
45

The root problem is (I think) that you are unwittingly using a character set for which:

 bytes != encode(decode(bytes))

in some cases. UTF-8 is an example of such a character set. Specifically, certain sequences of bytes are not valid encodings in UTF-8. If the UTF-8 decoder encounters one of these sequences, it is liable to discard the offending bytes or decode them as the Unicode codepoint for "no such character". Naturally, when you then try to encode the characters as bytes the result will be different.
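The effect can be reproduced with the byte array from the question. This is only a sketch (the class and method names are invented here), but it shows that decoding arbitrary bytes as UTF-8 and encoding them again does not give back the input:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8RoundTripDemo {

    // Decode the bytes as UTF-8, then encode the resulting String as UTF-8 again.
    static byte[] roundTripUtf8(byte[] bytes) {
        String decoded = new String(bytes, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] original = {0, 1, -2, 3, -4, -5, 6};

        // -2, -4 and -5 (0xFE, 0xFC, 0xFB) are never valid in UTF-8,
        // so the decoder substitutes U+FFFD, the replacement character.
        String decoded = new String(original, StandardCharsets.UTF_8);
        System.out.println(decoded.indexOf('\uFFFD') >= 0);                       // true

        // Encoding the replacement characters yields different bytes.
        System.out.println(Arrays.equals(original, roundTripUtf8(original)));     // false
    }
}
```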

The solution is:

  1. Be explicit about the character encoding you are using; i.e. use a String constructor and the String.getBytes method with an explicit charset.
  2. Use the right character set for your byte data ... or alternatively one (such as "Latin-1") where all byte sequences map to valid Unicode characters.
  3. If your bytes are (really) binary data and you want to be able to transmit / receive them over a "text based" channel, use something like Base64 encoding ... which is designed for this purpose.

For Java, the most common character sets are in java.nio.charset.StandardCharsets. If you are encoding a string that can contain any Unicode character value then UTF-8 encoding (UTF_8) is recommended.

If you want a 1:1 mapping in Java then you can use ISO Latin Alphabet No. 1 - more commonly just called "Latin 1" or simply "Latin" (ISO_8859_1). Note that Latin-1 in Java is the IANA version of Latin-1 which assigns characters to all possible 256 values including control blocks C0 and C1. These are not printable: you won't see them in any output.
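A quick way to check the 1:1 claim is to push all 256 possible byte values through ISO_8859_1 and back. The helper names below are hypothetical; this is a sketch, not a recommendation to store binary data in strings.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin1RoundTripDemo {

    // Each byte becomes exactly one char with the same unsigned value.
    static String toLatin1String(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Each char in the range U+0000..U+00FF becomes exactly one byte.
    static byte[] fromLatin1String(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Build an array holding every possible byte value.
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }

        String text = toLatin1String(all);
        System.out.println(text.length());                                // 256
        System.out.println(Arrays.equals(all, fromLatin1String(text)));   // true
    }
}
```

The round trip only holds as long as nothing in between modifies the String, for example by trimming, normalizing or re-encoding it.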

From Java 8 onwards Java contains java.util.Base64 for Base64 encoding / decoding. For URL-safe encoding you may want to use Base64.getUrlEncoder instead of the standard encoder. This class is also present in Android since Android Oreo (8), API level 26.
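The difference between the standard and the URL-safe encoder only shows up for inputs that produce the characters '+' and '/'. The sketch below picks three bytes that do; the wrapper class and method names are assumptions for the example.

```java
import java.util.Arrays;
import java.util.Base64;

class Base64UrlDemo {

    static String encodeStandard(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    static String encodeUrlSafe(byte[] bytes) {
        return Base64.getUrlEncoder().encodeToString(bytes);
    }

    static byte[] decodeUrlSafe(String text) {
        return Base64.getUrlDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] bytes = {-5, -1, -2}; // 0xFB 0xFF 0xFE

        System.out.println(encodeStandard(bytes)); // +//+
        System.out.println(encodeUrlSafe(bytes));  // -__-

        // Decoding with the matching decoder restores the original bytes.
        System.out.println(Arrays.equals(bytes, decodeUrlSafe(encodeUrlSafe(bytes)))); // true
    }
}
```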

Maarten Bodewes
  • 90,524
  • 13
  • 150
  • 263
Stephen C
  • 698,415
  • 94
  • 811
  • 1,216
34

We just need to construct a new String with the array: http://www.mkyong.com/java/how-do-convert-byte-array-to-string-in-java/

String s = new String(bytes);

The bytes you get back differ depending on the charset used: new String(bytes), new String(bytes, Charset.forName("utf-8")) and new String(bytes, Charset.forName("utf-16")) decode the same bytes into different strings, so String#getBytes() (which uses the default charset) returns a different byte array for each.

Ravindranath Akila
  • 209
  • 3
  • 35
  • 45
  • 10
    No. The bytes of the resulting string differs depending on what charset you use. `new String(bytes)` and `new String(bytes, Charset.forName("utf-8"))` and `new String(bytes, Charset.forName("utf-16"))` will all have different byte arrays when you call `String#getBytes()` (depending on the default charset) – dutoitns Feb 19 '15 at 06:04
  • 1
    Misleading. The `char`s (and thereby the text displayed) of the resulting `String` differs when decoding `bytes` differently. The conversion back to bytes using the default encoding (use `String#getBytes("charset")` to specify otherwise) will necessarily differ because it converts different input. Strings don't store the `byte[]` they were made from, `char`s don't have an encoding and a `String` does not store it otherwise. – zapl May 24 '16 at 08:39
16

Using new String(byOriginal) and converting back to byte[] using getBytes() doesn't guarantee two byte[] with equal values. Both calls use Charset.defaultCharset(): the constructor decodes with it, and getBytes() encodes with it via StringCoding.encode(..). The decoder may replace byte sequences it cannot map, and the encoder may replace characters it cannot encode. Hence String.getBytes() might not return an array equal to the one originally passed to the constructor.
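To see which charset the no-argument calls pick up on a given machine, something like the following can be run. It is a small sketch with an invented class name; the printed default charset depends on the platform and JVM settings, so no output is shown for that line.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class DefaultCharsetDemo {

    // getBytes() with no argument encodes with the default charset.
    static boolean noArgMatchesDefault(String text) {
        return Arrays.equals(text.getBytes(), text.getBytes(Charset.defaultCharset()));
    }

    public static void main(String[] args) {
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println(noArgMatchesDefault("caf\u00e9")); // true

        // The same text gives different bytes under different explicit charsets.
        System.out.println("caf\u00e9".getBytes(StandardCharsets.UTF_8).length);      // 5
        System.out.println("caf\u00e9".getBytes(StandardCharsets.ISO_8859_1).length); // 4
    }
}
```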

Sai Kishore
  • 326
  • 1
  • 7
  • 16
sfussenegger
  • 35,575
  • 15
  • 95
  • 119
11

What the problem was: as someone already pointed out, if you start with a byte[] and it does not in fact contain text data, there is no "proper conversion". Strings are for text, byte[] is for binary data, and the only really sensible thing to do is to avoid converting between them unless you absolutely have to.

I was observing this problem when I was trying to create byte[] from a pdf file and then converting it to String and then taking the String as input and converting back to file.

So make sure your encoding and decoding logic are the same, as I did. I explicitly encoded the byte[] to Base64 and decoded it to create the file again.

Use-case: Due to some limitation I was trying to send a byte[] in a POST request, and the process was as follows:

PDF File >> Base64.encodeBase64(byte[]) >> String >> Send in request(POST) >> receive String >> Base64.decodeBase64(byte[]) >> create binary

Try this; it worked for me.

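// Base64 here appears to be org.apache.commons.codec.binary.Base64 (Apache Commons Codec)
File file = new File("filePath");

// try-with-resources closes both streams even if an exception is thrown
try (FileInputStream fileInputStream = new FileInputStream(file);
     FileOutputStream fos = new FileOutputStream("newFilePath")) {
    // a single read() call may return fewer bytes than requested, so read in a loop
    byte[] byteArray = new byte[(int) file.length()];
    int offset = 0;
    while (offset < byteArray.length) {
        int n = fileInputStream.read(byteArray, offset, byteArray.length - offset);
        if (n < 0) break;
        offset += n;
    }

    // encode the raw bytes as a Base64 string
    String byteArrayStr = new String(Base64.encodeBase64(byteArray));

    // decode the Base64 string and write the original bytes back out
    fos.write(Base64.decodeBase64(byteArrayStr.getBytes()));
}
catch (FileNotFoundException e) {
    System.out.println("File Not Found.");
    e.printStackTrace();
}
catch (IOException e1) {
    System.out.println("Error Reading The File.");
    e1.printStackTrace();
}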
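The same file-to-string-to-file flow can also be sketched with only JDK classes (java.util.Base64 from Java 8 and java.nio.file.Files). This swaps in a different Base64 implementation than the answer uses; the class name, method names and temporary files are placeholders for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Base64;

class FileBase64Demo {

    // Read the whole file and return its contents as a Base64 string.
    static String encodeFile(Path source) throws IOException {
        return Base64.getEncoder().encodeToString(Files.readAllBytes(source));
    }

    // Decode the Base64 string and write the raw bytes to the target file.
    static void decodeToFile(String base64, Path target) throws IOException {
        Files.write(target, Base64.getDecoder().decode(base64));
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for the PDF and the recreated copy.
        Path source = Files.createTempFile("demo", ".bin");
        Path target = Files.createTempFile("demo-copy", ".bin");
        try {
            Files.write(source, new byte[] {0, 1, -2, 3, -4, -5, 6});

            String transported = encodeFile(source); // this String is what would go in the request
            decodeToFile(transported, target);

            System.out.println(Arrays.equals(
                    Files.readAllBytes(source), Files.readAllBytes(target))); // true
        } finally {
            Files.deleteIfExists(source);
            Files.deleteIfExists(target);
        }
    }
}
```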
Rupesh
  • 2,627
  • 1
  • 28
  • 42
7

Even though

new String(bytes, "UTF-8")

is correct, it declares the checked UnsupportedEncodingException, which forces you to handle it. As an alternative, you can use another constructor, String(byte[], Charset), available since Java 1.6; the StandardCharsets constants it is used with below were added in Java 1.7:

new String(bytes, StandardCharsets.UTF_8)

This one does not throw any exception.

Converting back should be also done with StandardCharsets.UTF_8:

"test".getBytes(StandardCharsets.UTF_8)

Again you avoid having to deal with checked exceptions.

gil.fernandes
  • 12,978
  • 5
  • 63
  • 76
6
private static String toHexadecimal(byte[] digest){
        String hash = "";
    for(byte aux : digest) {
        int b = aux & 0xff;
        if (Integer.toHexString(b).length() == 1) hash += "0";
        hash += Integer.toHexString(b);
    }
    return hash;
}
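The method above only goes from bytes to hex. A possible counterpart for the reverse direction is sketched below, together with a StringBuilder-based encoder; both method names are invented here and the validation choices are assumptions, not part of the answer.

```java
import java.util.Arrays;

class HexDemo {

    // Encode each byte as two lowercase hex digits.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Decode a hex string (upper or lower case) back into bytes.
    static byte[] fromHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit at index " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] original = {0, 1, -2, 3, -4, -5, 6};
        String hex = toHex(original);
        System.out.println(hex);                                     // 0001fe03fcfb06
        System.out.println(Arrays.equals(original, fromHex(hex)));   // true
    }
}
```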
sdelvalle57
  • 877
  • 2
  • 10
  • 16
5

The following sample code safely converts a byte array to a String and back to a byte array.

 byte bytesArray[] = { 1, -2, 4, -5, 10};
 String encoded = java.util.Base64.getEncoder().encodeToString(bytesArray);
 byte[] decoded = java.util.Base64.getDecoder().decode(encoded);
 System.out.println("input: "+Arrays.toString(bytesArray));
 System.out.println("encoded: "+encoded);
 System.out.println("decoded: "+Arrays.toString(decoded));

Output:

input: [1, -2, 4, -5, 10]
encoded: Af4E+wo=
decoded: [1, -2, 4, -5, 10]
Shiv Buyya
  • 3,770
  • 2
  • 30
  • 25
4

This works fine for me:

String cd = "Holding some value";

Converting from string to byte[]:

byte[] cookie = new sun.misc.BASE64Decoder().decodeBuffer(cd);

Converting from byte[] to string:

cd = new sun.misc.BASE64Encoder().encode(cookie);
Aamir
  • 16,329
  • 10
  • 59
  • 65
LeD
  • 85
  • 1
  • 4
  • Never ever use `sun.` internal classes. Every Java tutorial since 1.0 will warn against it, and the new modular system even directly disallows it by default. – Maarten Bodewes Dec 27 '21 at 12:27
4

I did notice something that is not in any of the answers. You can cast each of the bytes in the byte array to characters, and put them in a char array. Then the string is

new String(cbuf)
where cbuf is the char array. To convert back, loop through the string casting each of the chars to bytes to put into a byte array, and this byte array will be the same as the first.

public class StringByteArrTest {

    public static void main(String[] args) {
        // put whatever byte array here
        byte[] arr = new byte[] {-12, -100, -49, 100, -63, 0, -90};
        for (byte b: arr) System.out.println(b);
        // put data into this char array
        char[] cbuf = new char[arr.length];
        for (int i = 0; i < arr.length; i++) {
            cbuf[i] = (char) arr[i];
        }
        // this is the string
        String s = new String(cbuf);
        System.out.println(s);

        // converting back
        byte[] out = new byte[s.length()];
        for (int i = 0; i < s.length(); i++) {
            out[i] = (byte) s.charAt(i);
        }
        for (byte b: out) System.out.println(b);
    }

}
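One detail worth knowing about the plain cast: a negative byte is sign-extended before it becomes a char, so -12 ends up as U+FFF4 rather than U+00F4. The round trip still works because the cast back to byte keeps only the low 8 bits. The sketch below (hypothetical class and method names) compares the plain cast with a masked cast, which gives the same String as ISO-8859-1 decoding.

```java
import java.nio.charset.StandardCharsets;

class CharCastDemo {

    // Plain cast: the byte is sign-extended first, so -12 becomes U+FFF4.
    static char plainCast(byte b) {
        return (char) b;
    }

    // Masked cast: keeps only the low 8 bits, so -12 becomes U+00F4.
    static char maskedCast(byte b) {
        return (char) (b & 0xFF);
    }

    // Build a String from masked casts, one char per byte.
    static String maskedString(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length);
        for (byte b : bytes) {
            sb.append(maskedCast(b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] arr = {-12, -100, -49, 100, -63, 0, -90};

        System.out.println(Integer.toHexString(plainCast((byte) -12)));  // fff4
        System.out.println(Integer.toHexString(maskedCast((byte) -12))); // f4

        // The masked variant matches what the ISO-8859-1 decoder produces.
        System.out.println(maskedString(arr)
                .equals(new String(arr, StandardCharsets.ISO_8859_1)));  // true

        // Casting back to byte recovers the original value in both cases.
        System.out.println((byte) plainCast((byte) -12));                // -12
    }
}
```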

Leonid
  • 708
  • 5
  • 12
2

javax.xml.bind.DatatypeConverter should do it:

// parseHexBinary requires an even number of hex digits, so the value is padded with a leading 0
byte [] b = javax.xml.bind.DatatypeConverter.parseHexBinary("0E62DB");
String s = javax.xml.bind.DatatypeConverter.printHexBinary(b);
Tunaki
  • 132,869
  • 46
  • 340
  • 423
  • In newer versions of Java there is a `Base64` class included in `java.util`, and in the latest versions it can even handle hexadecimals directly (gasp!). – Maarten Bodewes Dec 27 '21 at 12:24
2
byte[] bytes = "Techie Delight".getBytes();
// System.out.println(Arrays.toString(bytes));

// Create a string from the byte array without specifying
// character encoding
String string = new String(bytes);
System.out.println(string);
Anand
  • 4,355
  • 2
  • 35
  • 45
1

Here are a few methods that convert an array of bytes to a string. I've tested them and they work well.

public String getStringFromByteArray(byte[] settingsData) {

    ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(settingsData);
    Reader reader = new BufferedReader(new InputStreamReader(byteArrayInputStream));
    StringBuilder sb = new StringBuilder();
    int byteChar;

    try {
        while((byteChar = reader.read()) != -1) {
            sb.append((char) byteChar);
        }
    }
    catch(IOException e) {
        e.printStackTrace();
    }

    return sb.toString();

}

public String getStringFromByteArray(byte[] settingsData) {

    StringBuilder sb = new StringBuilder();
    for(byte willBeChar: settingsData) {
        sb.append((char) willBeChar);
    }

    return sb.toString();

}
user2288580
  • 2,210
  • 23
  • 16
1

While base64 encoding is safe and one could argue "the right answer", I arrived here looking for a way to convert a Java byte array to/from a Java String as-is. That is, where each member of the byte array remains intact in its String counterpart, with no extra space required for encoding/transport.

This answer describing 8bit transparent encodings was very helpful for me. I used ISO-8859-1 on terabytes of binary data to convert back and forth successfully (binary <-> String) without the inflated space requirements needed for a base64 encoding, so is safe for my use-case - YMMV.

This was also helpful in explaining when/if you should experiment.

Reed Sandberg
  • 671
  • 1
  • 10
  • 18
  • Why the hell would you store TB of data in a string, what's wrong with binary in the first place? What fucked up protocol or API would require the data as a string? – Maarten Bodewes Dec 27 '21 at 12:22
  • @MaartenBodewes, not TB in a single string buffer, more like a stream of data over time. Been a few years since this post, but I think this was to satisfy a requirement using Apache Ignite. Not something I'd generally recommend, but useful if you need it. – Reed Sandberg Dec 30 '21 at 10:03
0
import sun.misc.BASE64Decoder;
import sun.misc.BASE64Encoder;    

private static String base64Encode(byte[] bytes)
{
    return new BASE64Encoder().encode(bytes);
}

private static byte[] base64Decode(String s) throws IOException
{
    return new BASE64Decoder().decodeBuffer(s);
}
Feng Zhang
  • 1,698
  • 1
  • 17
  • 20
  • Why? Why would go through Base64 in order to convert a byte to a String? The overhead. – james.garriss Oct 01 '15 at 11:51
  • @james.garriss Because there is no need to go from an *unspecified* byte value to string for storage, in the end you would only need it to **communicate** or **display**. And generally, it is hard to communicate e.g. a backspace or other control character (if not an unmapped character) in any kind of text based protocol. You'd only convert if you know if the text is printable in some kind of encoding format (UTF-8, Latin 1 etc.). – Maarten Bodewes Dec 27 '21 at 12:19
  • Cannot resolve symbol 'BASE64Encoder' – Kishan Solanki Mar 22 '22 at 07:55
0

I succeeded converting byte array to a string with this method:

public static String byteArrayToString(byte[] data){
    String response = Arrays.toString(data);

    String[] byteValues = response.substring(1, response.length() - 1).split(",");
    byte[] bytes = new byte[byteValues.length];

    for (int i=0, len=bytes.length; i<len; i++) {
        bytes[i] = Byte.parseByte(byteValues[i].trim());
    }

    String str = new String(bytes);
    return str.toLowerCase();
}
lxknvlk
  • 2,744
  • 1
  • 27
  • 32
0

This one works for me up to android Q:

You can use the following method to convert a hex string to a string

    public static String hexToString(String hex) {
    StringBuilder sb = new StringBuilder();
    char[] hexData = hex.toCharArray();
    for (int count = 0; count < hexData.length - 1; count += 2) {
        int firstDigit = Character.digit(hexData[count], 16);
        int lastDigit = Character.digit(hexData[count + 1], 16);
        int decimal = firstDigit * 16 + lastDigit;
        sb.append((char)decimal);
    }
    return sb.toString();
}

with the following to convert a byte array to a hex string

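    // hexArray was not shown in the original answer; a digit lookup table like this is assumed
    private static final char[] hexArray = "0123456789ABCDEF".toCharArray();

    public static String bytesToHex(byte[] bytes) {
        char[] hexChars = new char[bytes.length * 2];
        for (int j = 0; j < bytes.length; j++) {
            int v = bytes[j] & 0xFF;                   // treat the byte as unsigned
            hexChars[j * 2] = hexArray[v >>> 4];       // high nibble
            hexChars[j * 2 + 1] = hexArray[v & 0x0F];  // low nibble
        }
        return new String(hexChars);
    }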
Miguel Tomás
  • 1,714
  • 1
  • 13
  • 23
  • OK, hex works, but you forgot to supply `hexArray`, and for some reason your methods are not symmetrical (hex string -> string, followed by byte[] -> hex string). – Maarten Bodewes Jan 03 '22 at 23:39
-1

Here is the working code.

        // Encode the byte array into a String. templateBuffer1 is my byte array variable.
        String finger_buffer = Base64.encodeToString(templateBuffer1, Base64.DEFAULT);
        Log.d(TAG, "Captured biometric device->" + finger_buffer);

        // Decode the String back into a byte array. decodedString is a byte[].
        decodedString = Base64.decode(finger_buffer, Base64.DEFAULT);
-1

You can use a simple for loop for the conversion:

public void byteArrToString(){
   byte[] b = {'a','b','$'};
   String str = ""; 
   for(int i=0; i<b.length; i++){
       char c = (char) b[i];
       str+=c;
   }
   System.out.println(str);
}
amoljdv06
  • 2,646
  • 1
  • 13
  • 18
-1
byte[] image = {...};
String imageString = Base64.encodeToString(image, Base64.NO_WRAP);
Fakhar
  • 3,946
  • 39
  • 35
-2

Wrap the bytes in a ByteArrayInputStream and read them through a BufferedReader over an InputStreamReader. The reader is a character stream rather than a byte stream, so it decodes the byte data into a String.

package com.cs.sajal;

import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class TestCls {

    public static void main(String[] args) {

        String s = "Sajal is a good boy";

        try
        {
            // Encode and decode with the same explicit charset.
            ByteArrayInputStream bis = new ByteArrayInputStream(s.getBytes(StandardCharsets.UTF_8));

            BufferedReader br = new BufferedReader(new InputStreamReader(bis, StandardCharsets.UTF_8));
            System.out.println(br.readLine());

        }
        catch(Exception e)
        {
            e.printStackTrace();
        }

    }
}

Output is:

Sajal is a good boy

Nissa
  • 4,636
  • 8
  • 29
  • 37
-2

You can do the following to convert byte array to string and then convert that string to byte array:

// 1. convert byte array to string and then string to byte array

    // convert byte array to string
    byte[] by_original = {0, 1, -2, 3, -4, -5, 6};
    String str1 = Arrays.toString(by_original);
    System.out.println(str1); // output: [0, 1, -2, 3, -4, -5, 6]

    // convert string to byte array
    String newString = str1.substring(1, str1.length()-1);
    String[] stringArray = newString.split(", ");
    byte[] by_new = new byte[stringArray.length];
    for(int i=0; i<stringArray.length; i++) {
        by_new[i] = (byte) Integer.parseInt(stringArray[i]);
    }
    System.out.println(Arrays.toString(by_new)); // output: [0, 1, -2, 3, -4, -5, 6]

But to convert the string to byte array and then convert that byte array to string, below approach can be used:

// 2. convert string to byte array and then byte array to string

    // convert string to byte array
    String str2 = "[0, 1, -2, 3, -4, -5, 6]";
    byte[] byteStr2 = str2.getBytes(StandardCharsets.UTF_8);
    // Now byteStr2 is [91, 48, 44, 32, 49, 44, 32, 45, 50, 44, 32, 51, 44, 32, 45, 52, 44, 32, 45, 53, 44, 32, 54, 93]

    // convert byte array to string
    System.out.println(new String(byteStr2, StandardCharsets.UTF_8)); // output: [0, 1, -2, 3, -4, -5, 6]
  • I've downvoted. The question doesn't specify what is in the byte array. Sure you can encode a byte array to string and decode that using your code, but a single call to a base64 encoding will create a more dense and (more importantly) standardized encoding. So 1. it doesn't really address the question, and 2. if it would address the question, then the encoding is suboptimal. It's also basically a "code only" answer as it doesn't describe the encoding format or why this would be beneficial. Explicit methods would be nice as well. – Maarten Bodewes Dec 27 '21 at 15:14
  • @MaartenBodewes In the question it was mentioned about what is in the byte array. I have answered the same. Please check the question bro. – Vinay Kumar P.V. Dec 27 '21 at 15:17
-4

A string is a collection of char's (16bit unsigned). So if you are going to convert negative numbers into a string, they'll be lost in translation.

Toad
  • 15,593
  • 16
  • 82
  • 128
  • 2
    -1: This is incorrect. While 'byte' is a signed type in Java, they are treated as unsigned by the library code that does character set encoding and decoding. – Stephen C Oct 08 '09 at 07:53
  • A fine example why having an unsigned 8 bit datatype really is a good idea to have in a language. Avoids unnecessary confusion ;^) – Toad Oct 08 '09 at 08:18
  • Be careful about assuming that a Java char will be 16 bits, because of Java's UTF-16, they can expand up to 32 bits – Joe Plante Aug 30 '12 at 19:44
  • @JoePlante No it isn't. Char is a 16 bit unicode character. Unicode doesn't expand up or down like utf. Source: http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html – Toad Aug 30 '12 at 19:57
  • 1
    @Toad actually yes, some Unicode characters when stored as UTF-16 take up two code units, i.e. 32 bits. The same happens in UTF-8: some characters use two/three/four code units, i.e. 16/24/32 bits. In fact, that's exactly what UTF is about (i.e. UTF != Unicode). – CAFxX Dec 01 '12 at 17:52
  • @cafxx so what would happen if i assign the first 'character' of a string which happens to be a 4 byte utf16 character to a variable declared as a char? – Toad Dec 03 '12 at 21:16
  • 1
    @Toad you'd get the first surrogate - i.e. only the first "half" of the character. Look at the docs for the [String.charAt](http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/String.html#charAt%28int%29) method and the [Character](http://docs.oracle.com/javase/6/docs/api/java/lang/Character.html) class. – CAFxX Dec 04 '12 at 12:52
  • @CAFxX wow... I stand corrected. Something what I would not have guessed. – Toad Dec 04 '12 at 14:53
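The surrogate behaviour discussed in these comments can be checked with a character outside the Basic Multilingual Plane. The sketch below uses U+1F600 as an arbitrary example code point; the class and helper names are made up.

```java
import java.nio.charset.StandardCharsets;

class SurrogateDemo {

    // Build a String holding a single Unicode code point.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String s = fromCodePoint(0x1F600); // one character, outside the BMP

        System.out.println(s.length());                                // 2 (UTF-16 code units)
        System.out.println(s.codePointCount(0, s.length()));           // 1 (code point)
        System.out.println(Character.isHighSurrogate(s.charAt(0)));    // true
        System.out.println(Character.isLowSurrogate(s.charAt(1)));     // true
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length); // 4
    }
}
```

So a single char variable can hold only one half of such a character, which is what charAt(0) returns here.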
-4
public class byteString {

    /**
     * @param args
     */
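    public static void main(String[] args) throws Exception {
        String msg = "Hello";
        byte[] buff = msg.getBytes("UTF-8");
        // Arrays.toString prints the byte values; println(buff) would only print the array's identity
        System.out.println(java.util.Arrays.toString(buff));
        // decode with the same charset that was used for encoding
        String m = new String(buff, "UTF-8");
        System.out.println(m);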


    }

}