I have a messaging class in my Java program that only accepts String values, never binary data.

I want to send an RPM file, which is binary data, through this messaging class to a receiver.

I know this can be done by converting the binary data to a String on the sending end and then back to a binary file on the receiving end.

My question is: will any data be lost when the binary file is converted to a String and then back to binary data to be saved as a file, or is the data preserved through all conversions?

jgr208
  • That depends; how are you intending to *encode* the binary data? – Elliott Frisch May 18 '16 at 14:38
  • Provided that you don't use some exotic implementation, you won't lose anything. Have a look at Base64, for instance, which lets you convert byte arrays to String and vice versa: https://commons.apache.org/proper/commons-codec/apidocs/org/apache/commons/codec/binary/Base64.html – Arnaud May 18 '16 at 14:39
  • @ElliottFrisch I had not thought about that part of the design yet. What are my options? – jgr208 May 18 '16 at 14:39
  • @Berger That is an excellent library you linked to, but our project does not have Apache Commons. Can I still do this manually in Java? – jgr208 May 18 '16 at 14:40
  • To add to Elliott's comment, the basic idea here is that data may be lost if you choose an encoding which can't handle the range of characters in your data. – Tim Biegeleisen May 18 '16 at 14:40
  • @TimBiegeleisen Well, it's an `rpm` file on RHEL for a 32-bit architecture. I am not sure what the best encoding to use would be. – jgr208 May 18 '16 at 14:42
  • @jgr208 : look here for probably available implementations, or even custom ones : http://stackoverflow.com/questions/469695/decode-base64-data-in-java – Arnaud May 18 '16 at 14:43
  • +1 I work with RHEL files (and other CAD files). Actually, I work with code which allows uploading and viewing these files in Java. – Tim Biegeleisen May 18 '16 at 14:44
  • Can you modify the messaging class? Why does it support only Strings if you intend to send non-text data? – Kayaman May 18 '16 at 14:50
  • @Kayaman No, I cannot; that class's design is set in stone. It was a requirement that changed. – jgr208 May 18 '16 at 14:51
  • Isn't it amazing how designs are always "set in stone", but requirements always change? What would you do if it was completely impossible to achieve what you want? – Kayaman May 18 '16 at 14:55

1 Answer

Binary data means byte[], InputStream, OutputStream. Java internally uses Unicode for text: String, char, Reader, Writer.

Hence one should only convert binary data that represents text, and should specify the encoding of that binary data explicitly:

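import java.nio.charset.StandardCharsets;

byte[] bytes = ...                                     // bytes that hold UTF-8 encoded text
String s = new String(bytes, StandardCharsets.UTF_8);  // decode bytes to text
bytes = s.getBytes(StandardCharsets.UTF_8);            // encode text back to bytes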

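Non-text data should not be converted this way, because arbitrary bytes may be invalid in the chosen encoding, especially in the multibyte encoding UTF-8. Invalid sequences are replaced with a substitute character during decoding, so the original bytes cannot be recovered. The conversion to Unicode is also an unnecessary inefficiency: a Java char is two bytes (UTF-16 encoded).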
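
To make the data loss concrete, here is a minimal sketch (not part of the original answer) that pushes three arbitrary bytes through a UTF-8 decode and encode. The class name `Utf8RoundTrip`, the helper `roundTrip`, and the sample bytes are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf8RoundTrip {
    // Decodes arbitrary bytes as UTF-8 text, then encodes that text again.
    static byte[] roundTrip(byte[] data) {
        String s = new String(data, StandardCharsets.UTF_8);
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xFF can never appear in valid UTF-8, so the decoder
        // substitutes the replacement character U+FFFD for it.
        byte[] original = { (byte) 0xFF, 0x00, 0x41 };
        byte[] restored = roundTrip(original);

        System.out.println(Arrays.equals(original, restored)); // prints false
        System.out.println(restored.length);                   // prints 5
    }
}
```

The single byte 0xFF comes back as the three bytes EF BF BD (the UTF-8 form of U+FFFD), so the result is both longer and different from the input.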

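Better to use a ByteArrayInputStream, ByteArrayOutputStream, or ByteBuffer, depending on the purpose, and never a String. If a String is unavoidable, use StandardCharsets.ISO_8859_1, a single-byte encoding in which every byte value maps to exactly one char, so the round trip loses nothing.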
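
As a rough check of the ISO_8859_1 fallback, the sketch below round-trips every possible byte value. The class name `Latin1RoundTrip` and its two helpers are hypothetical names chosen for this example. It also assumes the messaging class passes every char through unchanged, including control characters such as NUL, which the question does not confirm.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Latin1RoundTrip {
    // Maps each byte 0x00..0xFF to the char with the same numeric value.
    static String toText(byte[] data) {
        return new String(data, StandardCharsets.ISO_8859_1);
    }

    // Reverses toText, provided the String was not altered in transit.
    static byte[] toBytes(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i; // covers every possible byte value once
        }
        String text = toText(all);

        System.out.println(text.length());                      // prints 256
        System.out.println(Arrays.equals(all, toBytes(text)));  // prints true
    }
}
```

If the transport trims whitespace, re-encodes the text, or treats NUL as a terminator, this approach breaks, and an ASCII-only encoding such as Base64 is the safer choice.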

Joop Eggen
  • I should have mentioned **Base64** for encoding the binary data as text. It inflates the size (and the processing time) by a factor of 4/3, and since a char is 2 bytes, the in-memory size is normally 8/3 of the original. For an **rpm** this seemed too costly. – Joop Eggen Nov 01 '22 at 08:56
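
The Base64 route raised in the comments does not need Apache Commons if Java 8 or later is available, since `java.util.Base64` ships with the JDK. The following is a minimal sketch under that assumption; the class name `Base64Transport`, its helpers, and the 300-byte sample payload are invented for illustration and stand in for the real file contents.

```java
import java.util.Arrays;
import java.util.Base64;

class Base64Transport {
    // Encodes arbitrary bytes as ASCII-only text that is safe in a String.
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Restores the original bytes from the Base64 text.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] payload = new byte[300];
        for (int i = 0; i < payload.length; i++) {
            payload[i] = (byte) i; // wraps around, so all byte values occur
        }
        String text = encode(payload);

        System.out.println(text.length());                        // prints 400
        System.out.println(Arrays.equals(payload, decode(text))); // prints true
    }
}
```

The 300 input bytes become 400 characters, which matches the 4/3 growth mentioned in the comment above.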