My goal is to read a stream of bytes from a socket into a file, and then play it back later as a test harness for my application. Somewhere in writing the bytes to disk, a byte gets written incorrectly, seemingly at random.

My writer looks like this:

blobWriter = new BufferedOutputStream(new FileOutputStream(blobFileName));  
blobChannel = Channels.newChannel(blobWriter);

I'm using a blobChannel so that I can write directly from a ByteBuffer. On each read of the socket, I simply pass the buffer to the writer:

if (key.isReadable()) {
    final int bytesRead= socketChannel.read(readBuffer);

    if(bytesRead == -1)
    {
        logger.warn("no bytes to read");
        break;
    }

    readBuffer.flip();
    blobChannel.write(readBuffer);
    ... 
    <continue to process data>
}

When the feed is live, the program processes reads into records, and they are not corrupt. Say for each message, it outputs a tuple of 7 fields. One of them, for example, is this:

(tupleid=0,msgType=110,feedId=225,venueId=30,orderId=160,symbol="CHF.NOK.SPOT",venueTime=44417979)

When, instead of connecting live to the market, I hook the application up to a reader that plays the same data back from disk, the processed output goes haywire:

(tupleid=0,msgType=110,feedId=225,venueId=30,orderId=160,symbol="CHF.-�ûnX",venueTime=44417979)

Notice the corrupt symbol.

The weirdest thing is that it will process thousands of messages with the same symbol and other fields with no problem, but then, inexplicably, one message gets corrupted. It's not always the symbol field that is incorrect; sometimes the orderId is wrong, etc.

I suspect that blobWriter is miswriting on occasion. Could my OS (Windows 7) be doing something funky? I've inspected the byte stream that is saved to disk in Notepad++, and it does show the incorrect bytes, so the error must be in the file writer, not in my playback mechanism. Furthermore, if the main application itself were buggy, it would misread bytes on the live feed; it doesn't.

Does anyone know what could possibly be going wrong?

Adam Hughes
  • Where is the output displayed ? – ps-aux Nov 18 '16 at 20:01
  • Which output specifically? – Adam Hughes Nov 18 '16 at 20:02
  • 'the processed output' you mentioned. – ps-aux Nov 18 '16 at 20:07
  • The two tuple records I showed are an example of processed output. The first, with correct fields, is when the processing is done on a live market. The second, with a bad byte, is an example of processing from data saved to disk. I'm thinking writeBufferChannel may be the problem, as the server is in non-blocking mode and the javadoc mentions this can be a problem – Adam Hughes Nov 18 '16 at 20:10
  • EJP, is this really a duplicate? I read that question before posting. If so, what would I change in my writer to ensure proper behavior? By the way, it seems like my channel is writing more bytes than my reader is actually reading from the buffer. – Adam Hughes Nov 18 '16 at 20:29
  • It is really a duplicate because the solution appears in my answer there. Your copy code is wrong. – user207421 Nov 18 '16 at 21:57
  • Ok, I thought I had tried your solution and it didn't work, but I wasn't using buffer.compact(). Let me implement this on Monday and if it works, I will close down the thread. – Adam Hughes Nov 18 '16 at 22:16
  • EJP, I reimplemented my code without using Channels, and still am seeing the same issue. Perhaps the problem is more clear in this new thread? http://stackoverflow.com/questions/40725665/java-bytes-to-file-are-they-miswritten – Adam Hughes Nov 21 '16 at 17:12

1 Answer

It looks like the WritableByteChannel Javadoc was trying to warn me:

Unless otherwise specified, a write operation will return only after writing all of the r requested bytes. Some types of channels, depending upon their state, may write only some of the bytes or possibly none at all. A socket channel in non-blocking mode, for example, cannot write any more bytes than are free in the socket's output buffer.

Indeed, the socket channel is in non-blocking mode. I profiled this, and roughly one in every 200 reads is out of sync. It seems like my channel is writing more bytes than my reader is actually reading from the buffer.
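To make the Javadoc warning concrete, here is a minimal sketch of a write loop that keeps calling write() until the buffer is drained. The names `FullWriteSketch`, `writeFully`, and `limited` are hypothetical, and the `limited` channel is only a test double that accepts a few bytes per call to imitate a partial write; it is not how a file or socket channel is implemented.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

class FullWriteSketch {
    // Hypothetical helper: repeats write() until no bytes remain in the buffer,
    // so a partial write cannot silently leave bytes behind.
    static int writeFully(WritableByteChannel channel, ByteBuffer buffer) throws IOException {
        int total = 0;
        while (buffer.hasRemaining()) {
            total += channel.write(buffer);
        }
        return total;
    }

    // Test double (assumption): accepts at most maxPerCall bytes per write() call.
    static WritableByteChannel limited(ByteArrayOutputStream sink, int maxPerCall) {
        return new WritableByteChannel() {
            @Override
            public int write(ByteBuffer src) {
                int n = Math.min(maxPerCall, src.remaining());
                for (int i = 0; i < n; i++) {
                    sink.write(src.get());
                }
                return n;
            }

            @Override
            public boolean isOpen() {
                return true;
            }

            @Override
            public void close() {
            }
        };
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        ByteBuffer buffer = ByteBuffer.wrap("CHF.NOK.SPOT".getBytes(StandardCharsets.US_ASCII));
        int written = writeFully(limited(sink, 5), buffer);
        System.out.println(written + " bytes: "
                + new String(sink.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```

On a channel that can return 0 for a long time (a non-blocking socket with a full output buffer, for example), this loop would spin, so it is better suited to blocking targets such as a file.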

Another SO thread offers more information.

What finally worked for me was copying the bytes into a byte array on each read instead of writing through the channel. This is probably not optimal for performance, but at least it doesn't result in wrong data:

if (readBuffer.hasRemaining()) {
    // Copy the unread bytes out of the buffer, then write them to the stream.
    byte[] b = new byte[readBuffer.remaining()];
    readBuffer.get(b);
    blobWriter.write(b);
}
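For comparison, the flip/write/compact pattern mentioned in the comments can be sketched as below. This is an illustration under assumptions, not a confirmed fix for the problem above: `CopyLoopSketch` and `copy` are hypothetical names, and the sketch uses blocking channels over in-memory streams rather than a non-blocking socket and a selector.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;

class CopyLoopSketch {
    // Copies everything from in to out through one reusable buffer.
    // The loop ends once the source reports end-of-stream and the buffer is empty.
    static long copy(ReadableByteChannel in, WritableByteChannel out, ByteBuffer buffer)
            throws IOException {
        long total = 0;
        while (in.read(buffer) >= 0 || buffer.position() > 0) {
            buffer.flip();              // switch to draining: limit = bytes held, position = 0
            total += out.write(buffer); // may consume only part of the buffer
            buffer.compact();           // move unwritten bytes to the front, back to filling
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] source = "CHF.NOK.SPOT".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copy(
                Channels.newChannel(new ByteArrayInputStream(source)),
                Channels.newChannel(sink),
                ByteBuffer.allocate(4)); // deliberately small, to force several iterations
        System.out.println(copied + " bytes: "
                + new String(sink.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```

Note that write() advances the buffer's position, so any code that needs to process the same bytes afterwards would have to work from a separate view such as `buffer.duplicate()` taken before the write.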
Adam Hughes
  • Copying it into a byte array should not be necessary. A simple `while (readBuffer.remaining() > 0) blobWriter.write(readBuffer);` should be OK. But if I understood this correctly, you have really experienced a case where the answer to my question that you linked to is *"No, it will not always write the whole buffer"* - is this correct? – Marco13 Nov 19 '16 at 01:27
  • Thanks Marco, I believe so. – Adam Hughes Nov 21 '16 at 01:16
  • I tried this Marco, but still getting misreads. I opened a separate thread, as I am having the same problem and am no longer using channels: http://stackoverflow.com/questions/40725665/java-bytes-to-file-are-they-miswritten – Adam Hughes Nov 21 '16 at 17:11