2

I have created a byte array of a file.

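    File file = new File("/home/user/Desktop/myfile.pdf");
    byte[] bFile = new byte[(int) file.length()];
    // read() may return fewer bytes than requested, so loop until the array is full
    try (FileInputStream fileInputStream = new FileInputStream(file)) {
        int offset = 0;
        while (offset < bFile.length) {
            int n = fileInputStream.read(bFile, offset, bFile.length - offset);
            if (n < 0) break; // end of file reached earlier than expected
            offset += n;
        }
    } catch (IOException e) {
        e.printStackTrace();
    }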

Now I have an API that expects JSON input, and I have to put the above byte array into it in String format. After reading the byte array in String format, I need to convert it back to a byte array again. So, please help me find out:

1) How to convert byte array to String and then back to the same byte array?

Aneesh
  • You mean like this? http://stackoverflow.com/questions/1536054/how-to-convert-byte-array-to-string-and-vice-versa – Levenal Sep 20 '13 at 07:53
  • Your code above is wrong - note that `FileInputStream.read(...)` does not necessarily read the whole file in one go. It might read less than the whole file. You need to look at how many bytes are read, and keep calling `read()` in a loop until the whole file is read. – Jesper Sep 20 '13 at 07:55
  • Yes, I have updated it as you said. Replaced the code to read the whole file to byte array with a loop. – Aneesh Sep 20 '13 at 10:27

4 Answers

4

The general problem of byte[] <-> String conversion is easily solved once you know the actual character set (encoding) that has been used to "serialize" a given text to a byte stream, or which is needed by the peer component to accept a given byte stream as text input - see the perfectly valid answers already given on this. I've seen a lot of problems caused by a lack of understanding of character sets (and text encoding in general) in enterprise Java projects, even with experienced software developers, so I really suggest diving into this quite interesting topic.

It is generally key to keep the character encoding information as some sort of "meta" information with your binary data if it represents text in some way. Hence the header in, for example, XML files, or even suffixes as parts of file names, as is sometimes seen with Apache htdocs contents, not to mention filesystem-specific ways to add any kind of metadata to files. Also, when communicating via, say, HTTP, the Content-Type header field often contains additional charset information to allow for correct interpretation of the actual content.

However, since in your example you read a PDF file, I'm not sure if you can actually expect pure text data anyway, regardless of any character encoding.

So in this case - depending on the rest of the application you're working on - you may want to transfer binary data within a JSON string. A common way to do so is to convert the binary data to Base64 and, once transferred, recover the binary data from the received Base64 string. The question "How do I convert a byte array to Base64 in Java?" is a good starting point for such a task.
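A minimal sketch of that Base64 round trip, assuming Java 8 or later, where `java.util.Base64` is part of the standard library (older versions would need a third-party codec instead). The class name `Base64RoundTrip` and the helpers `toBase64` and `fromBase64` are made up for this illustration, and the sample bytes merely stand in for the contents of the PDF.

```java
import java.util.Arrays;
import java.util.Base64;

class Base64RoundTrip {

    // Encode arbitrary bytes as an ASCII-only string that is safe inside a JSON value.
    static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Recover the original bytes from the Base64 text on the receiving side.
    static byte[] fromBase64(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Stand-in for the file contents; includes bytes that are not valid text.
        byte[] original = {0x25, 0x50, 0x44, 0x46, (byte) 0xFF, 0x00, (byte) 0x80};

        String encoded = toBase64(original);
        byte[] decoded = fromBase64(encoded);

        System.out.println("Base64 text: " + encoded);
        System.out.println("Round trip intact: " + Arrays.equals(original, decoded));
    }
}
```

The encoded string can be placed in the JSON document as an ordinary string value; the receiver calls the decoder on that value to get the bytes back.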

Community
1

String class provides an overloaded constructor for this.

    String s = new String(byteArray, "UTF-8");

    byteArray = s.getBytes("UTF-8");

Providing an explicit encoding charset is encouraged because different encoding schemes may have different byte representations. Read more here and here.

Also, your input stream may not read all the contents in one go. You have to read in a loop until there is nothing left to be read. Read the documentation: `read()` returns the number of bytes read.

Reads up to b.length bytes of data from this input stream into an array of bytes. This method blocks until some input is available.
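The reading loop described above could look roughly like this. It is only a sketch: the class name `ReadFully` and the helper `readFully` are invented for the example, and an in-memory stream stands in for the `FileInputStream`.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

class ReadFully {

    // Keep calling read() until exactly 'length' bytes have arrived.
    static byte[] readFully(InputStream in, int length) throws IOException {
        byte[] buffer = new byte[length];
        int offset = 0;
        while (offset < length) {
            // read() may deliver fewer bytes than requested; it returns -1 at end of stream.
            int n = in.read(buffer, offset, length - offset);
            if (n < 0) {
                throw new EOFException("Stream ended after " + offset + " bytes");
            }
            offset += n;
        }
        return buffer;
    }

    public static void main(String[] args) throws IOException {
        byte[] source = {1, 2, 3, 4, 5};
        try (InputStream in = new ByteArrayInputStream(source)) {
            byte[] copy = readFully(in, source.length);
            System.out.println("Bytes read: " + copy.length);
        }
    }
}
```

For a file on disk, `java.nio.file.Files.readAllBytes(path)` (available since Java 7) does the same job in a single call.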

rocketboy
  • A PDF file is not an UTF-8-encoded text file. This will not work. – Jesper Sep 20 '13 at 08:11
  • Of course. UTF-8 is just for illustration. Maybe I should have mentioned that explicitly. – rocketboy Sep 20 '13 at 08:33
  • It's not just about UTF-8. This whole idea does not work. A PDF file is not a text file, encoded with whatever character encoding. By using this `String` constructor, you're telling class `String` to interpret the data as if it is text. But the bytes of a PDF file are not text. – Jesper Sep 20 '13 at 09:56
  • Hi Jesper, you are right, it's not working in the case of PDF. Could you please give me the solution for this issue? – Aneesh Sep 20 '13 at 10:36
  • String here is just a transient representation of encoded bytes, that's what I understand from OP's use case. It does not have to be words with meaning. Maybe I interpreted it incorrectly. – rocketboy Sep 20 '13 at 10:46
  • `String` objects are not suited for storing arbitrary bytes, not even if it's just temporary. The `String` constructor you propose to use converts the bytes to characters, using a character encoding. If there's a sequence of bytes in the input that's not valid in the character encoding, you'll get an exception or other problems (it will try to convert it into some replacement character). Either way, it's then not possible to get the original bytes back from the string. – Jesper Sep 20 '13 at 11:29
  • Agreed, and using a broader encoding can help there. Anyways, I totally understand what you are saying and pretty much agree. Thanks for pointing it out. – rocketboy Sep 20 '13 at 11:56
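The loss described in the comments above can be reproduced with a small experiment. This is a sketch only: the class name `LossyStringRoundTrip` and the helper `viaUtf8String` are invented for the example, and the four sample bytes stand in for binary file content.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class LossyStringRoundTrip {

    // Decode the bytes as UTF-8 text, then encode the resulting String back to bytes.
    static byte[] viaUtf8String(byte[] data) {
        String s = new String(data, StandardCharsets.UTF_8);
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xFF and 0xFE can never appear in well-formed UTF-8, so the decoder
        // substitutes a replacement character for each of them.
        byte[] binary = {(byte) 0xFF, (byte) 0xFE, 0x00, 0x41};
        byte[] result = viaUtf8String(binary);

        System.out.println("Original:   " + Arrays.toString(binary));
        System.out.println("After trip: " + Arrays.toString(result));
        System.out.println("Identical:  " + Arrays.equals(binary, result));
    }
}
```

Plain ASCII text survives the same trip unchanged, which is why the problem often goes unnoticed until binary data such as a PDF is involved.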
0

Convert byte array to String

    String s = new String(bFile, "ISO-8859-1");

Convert String to byte array

    byte[] bArray = s.getBytes("ISO-8859-1");

Prabhakaran Ramaswamy
  • A PDF file is not an UTF-8-encoded text file. This will not work. – Jesper Sep 20 '13 at 08:11
  • This will still not work, because a PDF file is also not an ISO-8859-1-encoded text file... It is not a text file at all, you cannot convert it to a `String` like this. – Jesper Sep 20 '13 at 08:25
0

String.getBytes() and String(byte[] bytes) are methods to consider.

Jean Logeart