2

I need to send and receive large byte arrays over the internet (an HTTP RESTful service).

The simplest way I can think of is to convert the byte array into a string.

I searched around and found this post: Java Byte Array to String to Byte Array

I wrote the following code to verify the accuracy of the transformation.

    String message = "Die Strahlengriffelgewächse stammen...";
    System.out.println("message");
    System.out.println (message);

    byte[] pack = Fbs.packExce(message);    
    System.out.println ("pack");
    System.out.println (pack);
    System.out.println ("packlenght:" + pack.length);

    String toString = new String(pack);
    System.out.println ("toString");
    System.out.println (toString);

    byte[] toBytes = toString.getBytes();
    System.out.println ("toBytes");
    System.out.println (toBytes);
    System.out.println ("toByteslength:" +toBytes.length);

"Fbs.packExce()" is a method that takes a large chunk of text and produces a large byte array.

I varied the length of the message and printed the length of the byte array before converting it to a string and after converting it back.

I got the following results:

...
pack
[B@5680a178
packlenght:748
...
toBytes
[B@5fdef03a
toByteslength:750

----------------------

...
pack
[B@5680a178
packlenght:1016
...
toBytes
[B@5fdef03a
toByteslength:1018

I have omitted the "message" output since it is too long.

8 times out of 10, the derived byte array (the new one, "toBytes") is 2 bytes longer than the original byte array ("pack").

I say 8 out of 10 because there were also cases where the derived and the original arrays had the same length; see below:

...
pack
[B@5680a178
packlenght:824
toString
...
toBytes
[B@5fdef03a
toByteslength:824       
...

I cannot figure out the exact rule.

Does anyone have any idea?

Or are there better ways of converting a byte array to and from a string?

cheers

George Wang
  • https://docs.oracle.com/javase/8/docs/api/java/lang/String.html#String-byte:A- *"The behavior of this constructor when the given bytes are not valid in the default charset is unspecified."* did you consider that? –  Jul 08 '16 at 11:04
  • It's the Umlaut ... Use Base64 encoding. – Fildor Jul 08 '16 at 11:04
  • Could you print the Strings as well? I'd suspect some encoding problems since you're not passing any explicit encoding to `String#getBytes()` (it thus uses the system's default encoding) and I don't know what `Fbs#packExce()` is doing. – Thomas Jul 08 '16 at 11:04
  • The message was too long. Anyway, I just randomly copied and pasted it from wiki pages. – George Wang Jul 08 '16 at 11:21
  • public static byte[] packExce(String text){ FlatBufferBuilder builder = new FlatBufferBuilder(0); int textOffset = builder.createString(text); Exce.startExce(builder); Exce.addText(builder, textOffset); int exce = Exce.endExce(builder); Bucket.startBucket(builder); Bucket.addContentType(builder, Post.Exce); Bucket.addContent(builder, exce); int buck = Bucket.endBucket(builder); builder.finish(buck); return builder.sizedByteArray(); Base64.getMimeEncoder().encodeToString(buf.array()); } – George Wang Jul 08 '16 at 11:30
  • this is actually the application of flatbuffers binary protocol – George Wang Jul 08 '16 at 11:31
  • see [how to convert byte array to string and vice versa](http://stackoverflow.com/questions/1536054/how-to-convert-byte-array-to-string-and-vice-versa) and [Java Byte Array to String to Byte Array](http://stackoverflow.com/questions/1536054/how-to-convert-byte-array-to-string-and-vice-versa) – jan.supol Jul 08 '16 at 12:04

3 Answers

7

the simplest way I can think of is to convert the byte array into string.

The simplest way is the wrong way. For most character encodings, converting an arbitrary byte sequence to text is likely to be lossy.

A better (i.e. more robust) way is to use Base64 encoding. Read the javadoc for the Base64 class and its nested Encoder and Decoder classes.
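
As a minimal sketch of that suggestion, the round trip below uses the basic `java.util.Base64` encoder and decoder (Java 8+). The class name `Base64RoundTrip`, the helper names and the sample bytes are illustrative assumptions, not code from the question.

```java
import java.util.Arrays;
import java.util.Base64;

public class Base64RoundTrip {

    // Encode arbitrary bytes as an ASCII-only string (basic Base64 alphabet).
    public static String toText(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    // Decode the Base64 text back into the original bytes.
    public static byte[] fromText(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Sample bytes, including values that are not valid text in many charsets.
        byte[] original = {0x00, (byte) 0x80, (byte) 0xFF, 0x41, (byte) 0xC3};

        String text = toText(original);
        byte[] restored = fromText(text);

        System.out.println(text);
        // The round trip preserves every byte, so this prints true.
        System.out.println(Arrays.equals(original, restored));
    }
}
```

Because the encoded string contains only ASCII characters, it survives any transport that can carry plain text, at the cost of roughly a one-third increase in size.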


If you do persist in trying to convert arbitrary bytes to characters and back using new String(byte[]) and the like:

  • Be sure that you choose a character encoding where a Bytes -> Characters -> Bytes conversion sequence is not lossy. (ISO-8859-1, also known as LATIN-1, will work.)

  • Don't rely on the current execution platform's default character encoding for the encoding / decoding charset.

  • In a client / server system, the client and server have to use the same encoding.
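
To illustrate the points above, here is a small sketch that round-trips the same bytes through two explicitly named charsets. The class name `CharsetRoundTrip`, the helper `roundTrip` and the sample bytes are hypothetical, chosen only for demonstration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharsetRoundTrip {

    // Bytes -> Characters -> Bytes, always with an explicitly named charset
    // instead of the platform default.
    public static byte[] roundTrip(byte[] data, Charset charset) {
        return new String(data, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // 0x80 on its own is not a valid UTF-8 sequence.
        byte[] original = {0x41, (byte) 0x80, 0x42};

        byte[] viaLatin1 = roundTrip(original, StandardCharsets.ISO_8859_1);
        byte[] viaUtf8 = roundTrip(original, StandardCharsets.UTF_8);

        // ISO-8859-1 maps every byte value to exactly one character: prints true.
        System.out.println(Arrays.equals(original, viaLatin1));
        // UTF-8 replaces the malformed byte, so the data changes: prints false.
        System.out.println(Arrays.equals(original, viaUtf8));
    }
}
```

Both sides of a client / server exchange would have to pass the same `Charset` for this to work, which is one reason Base64 tends to be the less fragile choice.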

Stephen C
  • ... and of course, the client needs to know about the encoding being used, too. – meriton Jul 08 '16 at 11:19
  • @Fildor - For JDK < 8 : upgrade to Java 8 now. :-). Java 7 was EOL'd over a year ago, and earlier releases were EOL'd years ago. – Stephen C Jul 08 '16 at 11:28
  • Just wanted to name the possibility. Personally I am on Java 8. But that doesn't mean the OP always has a choice in Java version. – Fildor Jul 08 '16 at 11:33
  • And I just wanted to plant the idea that the OP or other readers on Java 7 and earlier *should* upgrade. Anyone stuck on an old Java release is liable to get burned eventually. – Stephen C Jul 08 '16 at 11:39
  • As for Base64, there are 3 variants: basic, URL and MIME. Which do you recommend? – George Wang Jul 08 '16 at 11:52
  • As a matter of fact, Base64 MIME had been my first implementation. ByteBuffer buf = ByteBuffer.wrap(Base64.getMimeDecoder().decode(bucket)); ... ... ByteBuffer buf = builder.dataBuffer(); return Base64.getMimeEncoder().encodeToString(buf.array()); Yet somehow it seems that Base64 MIME mis-converts the binary data even within the same Java VM. Do you have any idea? – George Wang Jul 08 '16 at 11:54
  • Ask a new question, including an MCVE that shows this problem. MCVE means executable by us. – Stephen C Jul 08 '16 at 12:22
2

I have a need to send and receive large byte array over internet(http restful service).

the simplest way I can think of is to convert the byte array into string.

If this is just about sending and receiving a byte array with JAX-RS, every JAX-RS implementation is perfectly capable of transmitting byte[]. See the specification, section 4.2.4.

jan.supol
  • Well, you should not make assumptions about the context. Yes, JAX-RS is part of the picture, yet there are still other parts, say, SOAP, swift. So the safest way is to convert to a string; then I do not expect any adaptation along the way to the destination. – George Wang Jul 08 '16 at 11:39
1

As suggested by Stephen C, I switched to Base64 in basic mode.

The following is my current complete verification code:

    String message = "Die Strahlengriffelgewächse stammen ... ...";
    System.out.println("message");
    System.out.println (message);

    byte[] pack = Fbs.packExce(message);    
    System.out.println ("pack");
    System.out.println (pack);
    System.out.println ("packlenght:" + pack.length);


    String toString = Base64.getEncoder().encodeToString(pack);
    System.out.println ("toString");
    System.out.println (toString);


    byte[] toBytes = Base64.getDecoder().decode(toString);
    System.out.println ("toBytes");
    System.out.println (toBytes);
    System.out.println ("toByteslength:" +toBytes.length);


    String toBytesExtraction = extractExce(toBytes);
    System.out.println ("toBytesExtraction");
    System.out.println (toBytesExtraction);

    String extraction = extractExce(pack);
    System.out.println ("extraction");
    System.out.println (extraction);


public static byte[] packExce(String text){

    FlatBufferBuilder builder = new FlatBufferBuilder(0);

    int textOffset = builder.createString(text);

    Exce.startExce(builder);
    Exce.addText(builder, textOffset);
    int exce = Exce.endExce(builder);

    Bucket.startBucket(builder);
    Bucket.addContentType(builder, Post.Exce);
    Bucket.addContent(builder, exce);       
    int buck = Bucket.endBucket(builder);

    builder.finish(buck);

    return builder.sizedByteArray();
    //ByteBuffer buf = builder.dataBuffer();
    //return buf;
    //return Base64.getMimeEncoder().encodeToString(buf.array());
}
private String extractExce(byte[] bucket ){

    String message = null;

    ByteBuffer buf = ByteBuffer.wrap(bucket);
    Bucket cont = Bucket.getRootAsBucket(buf); 
    System.out.println (cont.contentType());
    if (cont.contentType() == Post.Exce){
        message = ((Exce)cont.content(new Exce())).text();

    }
   return message; 
}

And it seems to work for my purpose:

...
pack
[B@5680a178
packlenght:2020
...
toBytes
[B@5fdef03a
toByteslength:2020
...
----------------------

...
pack
[B@5680a178
packlenght:1872
...

toBytes
[B@5fdef03a
toByteslength:1872
...

And both extractions, from "toBytes" and from "pack" respectively, faithfully restored the original "message":

String toBytesExtraction = extractExce(toBytes);
String extraction = extractExce(pack);

As a matter of fact, what I did not mention is that my original implementation used Base64 MIME. My starting point back then was a ByteBuffer (currently it is byte[]).

Here are my code snippets, if you are interested.

encoder

...
ByteBuffer buf = builder.dataBuffer();
return Base64.getMimeEncoder().encodeToString(buf.array());

decoder

ByteBuffer buf = ByteBuffer.wrap(Base64.getMimeDecoder().decode(bucket));

My guess is that the problem might have come from Base64 MIME,

because my first troubleshooting step was to remove Base64 MIME and use the ByteBuffer directly, and that was a success...

Well, I am wandering off a bit.

Back to the topic: I still have no idea about the 2-byte difference between the byte arrays before and after converting with "new String(byte[])" and "String.getBytes()"...

cheers

George Wang
  • *"I am still having no idea about the '2 bytes vary'"*. You would need to do a detailed (byte by byte) comparison of the bytes to determine what has caused the discrepancy. It could be in any of the 1000 or so characters of your original input string. It is probably a lossy encode / decode problem, as I described. – Stephen C Jul 09 '16 at 04:50
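
The byte-by-byte comparison suggested in the comment above could be sketched as follows. This is only an illustration under stated assumptions: the class `ByteDiff` and its helpers are hypothetical, the three sample bytes stand in for the real FlatBuffers output, and UTF-8 is named explicitly where the question relied on the platform default charset. With UTF-8, one malformed byte is replaced by U+FFFD, which encodes back to three bytes, so one plausible (unconfirmed) explanation for the observed growth is a single such byte in the packed data.

```java
import java.nio.charset.StandardCharsets;

public class ByteDiff {

    // Index of the first differing byte, or -1 if the arrays are identical.
    // If one array is a prefix of the other, the shorter length is returned.
    public static int firstMismatch(byte[] a, byte[] b) {
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        return a.length == b.length ? -1 : common;
    }

    // Space-separated lowercase hex dump, e.g. "41 80 42".
    public static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for binary data: 0x80 alone is malformed in UTF-8.
        byte[] pack = {0x41, (byte) 0x80, 0x42};

        // Same conversion as in the question, but with an explicit charset.
        String asText = new String(pack, StandardCharsets.UTF_8);
        byte[] toBytes = asText.getBytes(StandardCharsets.UTF_8);

        System.out.println("pack:    " + toHex(pack));      // 41 80 42
        System.out.println("toBytes: " + toHex(toBytes));   // 41 ef bf bd 42
        System.out.println("lengths: " + pack.length + " -> " + toBytes.length); // 3 -> 5
        System.out.println("first mismatch at index " + firstMismatch(pack, toBytes)); // 1
    }
}
```

Running the same comparison on the real "pack" and "toBytes" arrays would show where they first diverge and whether the replacement pattern ef bf bd appears there.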