While processing multi-gigabyte files I noticed something odd: reading from a file through a FileChannel into a reused ByteBuffer allocated with allocateDirect seems to be much slower than reading from a MappedByteBuffer. In fact, it is even slower than reading into byte arrays using regular read calls!
I was expecting it to be (almost) as fast as reading from a MappedByteBuffer: since my ByteBuffer is allocated with allocateDirect, the read should end up directly in my buffer without any intermediate copies.
My question now is: what am I doing wrong? Or is ByteBuffer + FileChannel really slower than regular I/O or mmap?
In the example code below I also added some code that converts what is read into long values, as that is what my real code constantly does. I would expect the ByteBuffer getLong() method to be much faster than my own byte shuffler.
Test results (in seconds):
mmap: 3.828
bytebuffer: 55.097
regular i/o: 38.175
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileChannel.MapMode;
import java.nio.MappedByteBuffer;

class testbb {
    // File size in bytes and number of 24-byte records in it
    static final int size = 536870904, n = size / 24;

    // Decodes the 8 bytes starting at offset as a big-endian long
    static public long byteArrayToLong(byte[] in, int offset) {
        return ((long) (in[offset + 0] & 0xff) << 56)
             | ((long) (in[offset + 1] & 0xff) << 48)
             | ((long) (in[offset + 2] & 0xff) << 40)
             | ((long) (in[offset + 3] & 0xff) << 32)
             | ((long) (in[offset + 4] & 0xff) << 24)
             | ((long) (in[offset + 5] & 0xff) << 16)
             | ((long) (in[offset + 6] & 0xff) << 8)
             | (long) (in[offset + 7] & 0xff);
    }

    public static void main(String[] args) throws IOException {
        long start;
        RandomAccessFile fileHandle;
        FileChannel fileChannel;

        // create file
        fileHandle = new RandomAccessFile("file.dat", "rw");
        byte[] buffer = new byte[24];
        for (int index = 0; index < n; index++)
            fileHandle.write(buffer);
        fileChannel = fileHandle.getChannel();

        // mmap()
        MappedByteBuffer mbb = fileChannel.map(FileChannel.MapMode.READ_WRITE, 0, size);
        byte[] buffer1 = new byte[24];
        start = System.currentTimeMillis();
        for (int index = 0; index < n; index++) {
            mbb.position(index * 24);
            mbb.get(buffer1, 0, 24);
            long dummy1 = byteArrayToLong(buffer1, 0);
            long dummy2 = byteArrayToLong(buffer1, 8);
            long dummy3 = byteArrayToLong(buffer1, 16);
        }
        System.out.println("mmap: " + (System.currentTimeMillis() - start) / 1000.0);

        // bytebuffer
        ByteBuffer buffer2 = ByteBuffer.allocateDirect(24);
        start = System.currentTimeMillis();
        for (int index = 0; index < n; index++) {
            buffer2.rewind();
            fileChannel.read(buffer2, index * 24);
            buffer2.rewind(); // need to rewind it to be able to use it
            long dummy1 = buffer2.getLong();
            long dummy2 = buffer2.getLong();
            long dummy3 = buffer2.getLong();
        }
        System.out.println("bytebuffer: " + (System.currentTimeMillis() - start) / 1000.0);

        // regular i/o
        byte[] buffer3 = new byte[24];
        start = System.currentTimeMillis();
        for (int index = 0; index < n; index++) {
            fileHandle.seek(index * 24);
            fileHandle.read(buffer3);
            // decode the buffer that was just read (buffer3, not buffer1)
            long dummy1 = byteArrayToLong(buffer3, 0);
            long dummy2 = byteArrayToLong(buffer3, 8);
            long dummy3 = byteArrayToLong(buffer3, 16);
        }
        System.out.println("regular i/o: " + (System.currentTimeMillis() - start) / 1000.0);
    }
}
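On the conversion step: a small check, separate from the benchmark, that ByteBuffer.getLong() and the hand-written byte shuffling decode the same value. This is only a sketch. The class name LongDecodeCheck and the test pattern are made up for illustration, and the loop version of byteArrayToLong is my rewrite of the method above. It relies on ByteBuffer using big-endian byte order by default.

```java
import java.nio.ByteBuffer;

class LongDecodeCheck {
    // Same big-endian decoding as byteArrayToLong in the post, written as a loop.
    static long byteArrayToLong(byte[] in, int offset) {
        long value = 0;
        for (int i = 0; i < 8; i++) {
            value = (value << 8) | (in[offset + i] & 0xffL);
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] record = new byte[24];
        for (int i = 0; i < record.length; i++) {
            record[i] = (byte) (i * 11 + 1); // arbitrary non-zero test pattern
        }
        ByteBuffer view = ByteBuffer.wrap(record); // big-endian unless order() is changed
        for (int offset = 0; offset < record.length; offset += 8) {
            long manual = byteArrayToLong(record, offset);
            long viaBuffer = view.getLong(offset); // absolute get, position is not moved
            System.out.println("offset " + offset + " equal: " + (manual == viaBuffer));
        }
    }
}
```

Since both paths yield the same values, the decoding can be swapped freely, and any timing difference between the variants comes from how the bytes are fetched rather than from how they are decoded.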
Since loading large sections and then processing them is not an option (I'll be reading data all over the place), I think I should stick to a MappedByteBuffer. Thank you all for your suggestions.
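For reference, a minimal sketch of what random access to 24-byte records through a mapping could look like, using the absolute getLong(int) so that neither position() nor an intermediate byte array is needed. The names MappedRecordDemo, readField and roundTrip, the temp file, and the test values are all hypothetical; this is not the benchmark above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

class MappedRecordDemo {
    static final int RECORD_SIZE = 24; // three 8-byte longs per record, as in the post

    // Reads field 0..2 of the given record with an absolute get.
    static long readField(ByteBuffer mapped, int record, int field) {
        return mapped.getLong(record * RECORD_SIZE + field * 8);
    }

    // Writes `records` records to a temp file (field f of record r holds r * 10 + f),
    // maps the file read-only and returns the three fields of `record`.
    static long[] roundTrip(int records, int record) {
        try {
            Path file = Files.createTempFile("records", ".dat");
            file.toFile().deleteOnExit();
            ByteBuffer out = ByteBuffer.allocate(records * RECORD_SIZE);
            for (int r = 0; r < records; r++) {
                for (int f = 0; f < 3; f++) {
                    out.putLong(r * 10L + f);
                }
            }
            Files.write(file, out.array());
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                MappedByteBuffer mapped =
                        channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
                return new long[] {
                    readField(mapped, record, 0),
                    readField(mapped, record, 1),
                    readField(mapped, record, 2)
                };
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(roundTrip(1000, 42))); // prints [420, 421, 422]
    }
}
```

One thing to keep in mind with multi-gigabyte files: a single call to FileChannel.map can cover at most Integer.MAX_VALUE bytes, so a larger file needs several mappings and a bit of arithmetic to pick the right one for a given record.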