
I have persistent problems with networking in Java. It might be because I'm used to operating on bytes, which are for some reason signed in Java (I take it as proof of the existence of the ultimate Evil).

I have been struggling to send binary data using a packet system. Originally, I wanted to build the data myself, but Java is not designed to operate with bytes. Instead, it's advised that you send a serialised object.

So I looked up how to create a serialisable class ChatPacket that I will send over the network.

This is my code, derived from an answer similar to the one I linked above (I couldn't find the original one):

//Needs java.io.ByteArrayOutputStream, java.io.IOException, java.io.ObjectOutput,
//java.io.ObjectOutputStream and java.nio.ByteBuffer
//Convert packet to raw data
public byte[] getBytes() {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    ObjectOutput out = null;
    try {
      //Pass the stream reference to the weird object output stream thing
      out = new ObjectOutputStream(bos);
      //Save the packet into byte stream
      out.writeObject(this);
      //Make sure everything buffered by the object stream has reached bos
      out.flush();
      //Create separate arrays of bytes
      byte[] data = bos.toByteArray();
      //Funny way to convert int in array of 4 bytes. The Java Way...
      byte[] size = ByteBuffer.allocate(4).putInt(data.length).array();
      //The final data array - feel free to suggest more efficient way to merge the two arrays
      byte[] data_all = new byte[data.length+4];
      //Merge the arrays of bytes - I found no other way than creating third array.
      for(int i=0; i<data_all.length; i++) {
          //Save the size
          if(i<4) {
             data_all[i] = size[i];
          }
          //Save the data
          else {
             data_all[i] = data[i-4];
          }
      }
      //Debug output
      Log.debug("Sending packet of "+data.length+" bytes");
      for(int i=0; i<data_all.length; i++) {
          Log.debug(i+": Sending character: "+((char)data_all[i])+" ["+(int)data_all[i]+"]");
      }
      return data_all;
    } catch (IOException e) {
      //The snippet as posted stops at the debug loop; this ending is an assumed minimal completion
      throw new RuntimeException("Could not serialise packet", e);
    }
}

Now the last debug lines produce a nice output:

Sending packet of 262 bytes
0: Sending character: 0 [0]     -  
1: Sending character: 0 [0]     |  The int - size of the packet
2: Sending character: 1 [1]     |
3: Sending character: 6 [6]     -
4: Sending character: -84 [-84] -
5: Sending character: -19 [-19] |
6: Sending character: 0 [0]     |  The data generated within the serialisation function
7: Sending character: 5 [5]     |
8: Sending character: 115 [115] |
9: Sending character: 114 [114] ...
10: ...

There are numbers instead of actual chars. Forgive me that - I fixed the function after adding the code and was too lazy to rewrite it.
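On the side question in the code comments about merging the two arrays: a minimal sketch, assuming the only goal is a 4-byte big-endian length followed by the payload. The class name `Framing` and the method `frame` are hypothetical and not part of `ChatPacket`. A single `ByteBuffer` can hold both the size and the data, so no manual copy loop is needed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of the original ChatPacket class.
final class Framing {
    // Returns a new array: 4-byte big-endian length, then the payload bytes.
    static byte[] frame(byte[] payload) {
        return ByteBuffer.allocate(4 + payload.length)
                .putInt(payload.length) // ByteBuffer is big-endian by default
                .put(payload)
                .array();
    }

    public static void main(String[] args) {
        byte[] payload = "hi".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Arrays.toString(frame(payload))); // [0, 0, 0, 2, 104, 105]
    }
}
```

`System.arraycopy` into a pre-sized array would work just as well; the buffer version simply keeps the size and the data in one expression.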

So far so good. Now let's see what arrives on the other side. I'm receiving data in a loop where I call the read method on some buffered stream:

while((character=in.read())!=-1) 
    ...

I have created an output very similar to the one you can see above, except that it now prints the data received.

1: Received character: \0 [0] -
2: Received character: \0 [0] |  The size again
3: Received character:  [1]   |
4: Received character:  [6]   -
Expecting 262 bytes of data!     -
1: Received character: � [65533] |
2: Received character: � [65533] | The serialised data...
3: Received character: \0 [0]    |
4: Received character:  [5]     ...
5: ...

As you can see, instead of overflowing or something like that, the negative numbers have somehow turned into huge values.
I can't see what's actually being sent over the network - I have already asked about that, but the answer really didn't help, as you can see.

Basically, my questions are now:

  1. How is it possible that the negative values can pass over the network as bytes - unless they are already broken when sent?
  2. How can I receive such values? The values in the debug output above are returned by the read method.
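Regarding the first question, a small sketch of what "negative" means for a byte. A Java `byte` is 8 bits interpreted as a signed value, so -84 and the unsigned value 172 (0xAC) are the same bit pattern, and only that bit pattern travels over the wire. The class name `SignedBytes` and the helper `toUnsigned` are made up for illustration.

```java
import java.io.ByteArrayInputStream;

// Hypothetical demo class; names are illustrative only.
final class SignedBytes {
    // Reinterprets the 8 bits of a signed Java byte as a value in 0..255.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xAC;                 // bit pattern 1010 1100
        System.out.println(b);                // -84
        System.out.println(toUnsigned(b));    // 172
        // InputStream.read() reports the same byte as an int in 0..255
        int read = new ByteArrayInputStream(new byte[] {b}).read();
        System.out.println(read);             // 172
        // Casting back to byte restores the original signed value
        System.out.println((byte) read == b); // true
    }
}
```

So a value read from a byte-oriented stream can always be cast back to `byte` without loss; a value as large as 65533 cannot have come from a single byte.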

In the end, I need to reconstruct the byte array. So I also need values which can be losslessly converted into bytes.

private List<Byte> buffer = new ArrayList<>();
public boolean receiveChar(int current) {
    buffer.add((byte)current);
    ... some stuff
}
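For comparison, a hedged sketch of reading one whole packet from a byte-oriented stream instead of collecting single values. It assumes the framing shown earlier (a 4-byte big-endian size followed by that many bytes) and that the input is a plain `InputStream`, not a `Reader`. `FrameReader` and `readFrame` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper; assumes a 4-byte big-endian length prefix.
final class FrameReader {
    // Reads the size, then blocks until exactly that many payload bytes arrive.
    static byte[] readFrame(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int size = data.readInt();
        if (size < 0) {
            throw new IOException("Negative frame size: " + size);
        }
        byte[] payload = new byte[size];
        data.readFully(payload); // throws EOFException if the stream ends early
        return payload;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the socket stream: size 4, then the first serialised bytes
        byte[] wire = {0, 0, 0, 4, (byte) 0xAC, (byte) 0xED, 0, 5};
        byte[] payload = readFrame(new ByteArrayInputStream(wire));
        System.out.println(Arrays.toString(payload)); // [-84, -19, 0, 5]
    }
}
```

With a real socket, the stream passed in would be whatever `getInputStream()` returns; the `ByteArrayInputStream` here only keeps the example self-contained.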
Tomáš Zato
  • `InputStream#read()` returns an `int` if memory serves... What happens if you cast the received `int` to a `byte`? – awksp Jun 03 '14 at 21:10
  • I don't really know. But since the `read()` itself already returns wrong value (please see my edit) it really doesn't matter what happens later on. – Tomáš Zato Jun 03 '14 at 21:13
  • It's still odd to me, because `65533` is out of the range for a `byte`... Try `character = ((byte)in.read())`, and make `character` a `byte`? Don't think that'll help, but at least the debug log will make more sense – awksp Jun 03 '14 at 21:23
  • What you suggest can be performed anytime later on. And since the values remained broken after them being cast to `byte` and stacked in `List` I don't suppose this is the way. Rather, I'm now trying to find a `read` overload that is not retarded. – Tomáš Zato Jun 03 '14 at 21:27
  • Yes, it could, but I would at least like to see log entries that make sense, but if you insist... I take it it's only the negative numbers that are broken? Do all the negative values come out as `65533`? – awksp Jun 03 '14 at 21:29
  • All I read is `meh Java sucks in every regard with bytes`. You are using it obviously wrong when just patchworking some Java snippets together. You also didn't even mention how you send things (TCP? UDP? MemoryFiles?) and how your `ChatPacket` class looks like. – Thomas Jungblut Jun 03 '14 at 22:14
  • 'Java is not designed to operate with bytes' is ridiculously false. The 'Java way' to send an int is writeInt(). – user207421 Jun 03 '14 at 22:18
  • In multithreading you can hardly just write int to the output stream from wherever you want. You must buffer output in some queue. And I use byte arrays for that queue. – Tomáš Zato Jun 04 '14 at 00:09
  • 65533 is 0xFFFD, which is the [unicode replacement character](http://en.wikipedia.org/wiki/Specials_(Unicode_block)#Replacement_character). I suspect that you're taking your `byte`s and converting them to `chars` using a reader on receiving. If the bytes don't happen to decode to a valid code point (which, not all byte sequences do in all encodings), the reader will instead output U+FFFD. – yshavit Jun 04 '14 at 15:34
  • On looking closer, I'm even more confident that's what's happening. -84 and -19 are 0xAC and 0xED. In UTF-8, 0xAC is a continuation byte; it can only follow a byte that starts a multibyte sequence, which the preceding byte (6, or 0x06) does _not_. 0xED starts a three-byte sequence, but the byte after it (0x00) is not a continuation byte. That means both are invalid bytes in this sequence in UTF-8, and would thus be decoded as the replacement char. It sounds like you need to spend some time to better understand the relationships between bytes, chars and charsets. – yshavit Jun 04 '14 at 16:08
  • Yes, the problem was really caused by the buffered reader (which I used to read the data) interpreting these bytes instead of reading them directly. That happened because I just happened to pick a java tutorial that uses this reader without precisely explaining what it does. And yes, my understanding of encoding is very poor at the moment. In my head, it's still in drawer labeled "*Magic*". – Tomáš Zato Jun 04 '14 at 16:13
  • Did you read the article I linked to in a previous question of yours? The article's tone is a bit aggressive/annoying to my taste, but it contains good info. http://www.joelonsoftware.com/articles/Unicode.html if you need it again – yshavit Jun 04 '14 at 16:15
  • Thank you a lot and sorry for wasting your time with unanswerable questions. – Tomáš Zato Jun 04 '14 at 16:19
  • No prob, that's what the site's here for! – yshavit Jun 04 '14 at 16:20
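
The diagnosis from the comments can be reproduced with a short sketch. It assumes the receiving side wrapped the socket stream in a reader that decodes UTF-8 (the charset actually in effect is not stated in the question), and the class and method names are hypothetical. The same four bytes are read once as raw bytes and once as decoded chars.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; UTF-8 is an assumption about the receiving side.
final class ReaderVsStream {
    // Raw bytes: InputStream.read() returns each byte as an int in 0..255.
    static List<Integer> viaStream(byte[] wire) throws IOException {
        InputStream in = new ByteArrayInputStream(wire);
        List<Integer> values = new ArrayList<>();
        int v;
        while ((v = in.read()) != -1) {
            values.add(v);
        }
        return values;
    }

    // Decoded text: Reader.read() returns chars, and malformed UTF-8 input
    // is replaced with U+FFFD (65533) instead of being passed through.
    static List<Integer> viaReader(byte[] wire) throws IOException {
        Reader in = new InputStreamReader(new ByteArrayInputStream(wire), StandardCharsets.UTF_8);
        List<Integer> values = new ArrayList<>();
        int c;
        while ((c = in.read()) != -1) {
            values.add(c);
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        byte[] wire = {(byte) 0xAC, (byte) 0xED, 0, 5}; // start of a serialised object
        System.out.println(viaStream(wire)); // [172, 237, 0, 5]
        System.out.println(viaReader(wire)); // [65533, 65533, 0, 5]
    }
}
```

The reader output matches the received values shown in the question, while the plain stream returns values that cast back to the original bytes.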
