I'm trying to process a file int-wise efficiently. I do this by first mapping the file into memory using a MappedByteBuffer:

public static ByteBuffer getByteBuffer(String filePath, long start, long size) throws IOException
{
    // A mapping stays valid after its channel is closed, so the file handle can be released right away
    try (RandomAccessFile binaryFile = new RandomAccessFile(filePath, "r");
         FileChannel binaryFileChannel = binaryFile.getChannel())
    {
        return binaryFileChannel.map(FileChannel.MapMode.READ_ONLY, start, size);
    }
}

Then I call the usual getInt() until the entire file has been processed.
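
For reference, the read loop described above can be sketched roughly as follows. This is only a minimal illustration, not the exact code in use: the class `MappedIntSum`, the helper `sumInts` and the idea of summing the values are made up for the example, and it assumes the file is small enough to be mapped in one piece (at most `Integer.MAX_VALUE` bytes).

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedIntSum
{
    // Hypothetical helper: maps the whole file and sums every complete 4-byte int in it
    static long sumInts(Path file, ByteOrder order) throws IOException
    {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ))
        {
            ByteBuffer buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            buffer.order(order);

            long sum = 0;
            // Relative getInt() advances the position by 4 bytes per call
            while (buffer.remaining() >= Integer.BYTES)
            {
                sum += buffer.getInt();
            }
            return sum;
        }
    }

    public static void main(String[] arguments) throws IOException
    {
        // Write three big-endian ints to a temporary file, then read them back through the mapping
        Path file = Files.createTempFile("ints", ".bin");
        file.toFile().deleteOnExit();
        ByteBuffer data = ByteBuffer.allocate(3 * Integer.BYTES).order(ByteOrder.BIG_ENDIAN);
        data.putInt(1).putInt(2).putInt(3);
        Files.write(file, data.array());

        System.out.println(sumInts(file, ByteOrder.BIG_ENDIAN)); // prints 6
    }
}
```

The byte order passed in has to match the one the file was written with, otherwise the ints come out byte-swapped.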

To speed this up further, I looked into the Unsafe class, which is explained quite well in this context here. Since it reads and writes memory natively, it is supposed to be faster than the "safe" Buffer classes. I wrote the following test code:

import sun.misc.Unsafe;
import sun.nio.ch.DirectBuffer;

import java.lang.reflect.Field;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class UnsafeTesting
{
    private ByteBuffer destinationByteBuffer;
    private ByteBuffer sourceByteBuffer;

    private UnsafeTesting()
    {
        int capacity = 1_000_000_000;
        destinationByteBuffer = allocateByteBuffer(capacity);
        sourceByteBuffer = allocateByteBuffer(capacity);

        for (int byteBufferIndex = 0; byteBufferIndex < sourceByteBuffer.capacity() - 3; byteBufferIndex += 4)
        {
            sourceByteBuffer.putInt(byteBufferIndex);
        }

        destinationByteBuffer.clear();
        sourceByteBuffer.clear();
    }

    private ByteBuffer allocateByteBuffer(int capacity)
    {
        return ByteBuffer.allocateDirect(capacity).order(ByteOrder.nativeOrder());
    }

    private void runTest(boolean useUnsafeMethod) throws Exception
    {
        Unsafe unsafe = getUnsafeInstance();
        long destinationByteBufferAddress = ((DirectBuffer) destinationByteBuffer).address();
        long sourceByteBufferAddress = ((DirectBuffer) sourceByteBuffer).address();

        int executionsCount = 0;

        if (useUnsafeMethod)
        {
            for (int sourceBufferIndex = 0; sourceBufferIndex < destinationByteBuffer.remaining() - 3; sourceBufferIndex += 4)
            {
                long sourceOffset = sourceByteBufferAddress + sourceBufferIndex;
                int value = unsafe.getInt(sourceOffset);

                long targetOffset = destinationByteBufferAddress + sourceBufferIndex;
                unsafe.putInt(targetOffset, value);

                executionsCount++;
            }
        } else
        {
            while (sourceByteBuffer.remaining() > 3)
            {
                // Copy from source to destination, matching the direction of the Unsafe branch
                int value = sourceByteBuffer.getInt();
                destinationByteBuffer.putInt(value);

                executionsCount++;
            }
        }

        boolean equal = sourceByteBuffer.equals(destinationByteBuffer);

        if (!equal)
        {
            throw new IllegalStateException("Buffers not equal!");
        }

        System.out.println("Executions: " + executionsCount);
    }

    private static Unsafe getUnsafeInstance() throws Exception
    {
        Field unsafe = Unsafe.class.getDeclaredField("theUnsafe");
        unsafe.setAccessible(true);

        return (Unsafe) unsafe.get(null);
    }

    private static void runTest(UnsafeTesting unsafeTesting, boolean useUnsafeMethod) throws Exception
    {
        long startingTime = System.nanoTime();
        unsafeTesting.runTest(useUnsafeMethod);
        long nanoSecondsTaken = System.nanoTime() - startingTime;
        double milliSecondsTaken = nanoSecondsTaken / 1e6;
        System.out.println(milliSecondsTaken + " milliseconds taken");
    }

    public static void main(String[] arguments) throws Exception
    {
        UnsafeTesting unsafeTesting = new UnsafeTesting();

        System.out.println("### Unsafe ###");
        runTest(unsafeTesting, true);
        System.out.println();

        System.out.println("### Direct ###");
        runTest(unsafeTesting, false);
    }
}

The output is something like the following:

### Unsafe ###
Executions: 250000000
1687.07085 milliseconds taken

### Direct ###
Executions: 250000000
657.23575 milliseconds taken

Why is the unsafe approach slower? My guess is that, because of the direct addressing, the data is not processed sequentially. Is there a way to process the buffer sequentially with Unsafe so that it becomes faster still? Using an IntBuffer is about as fast as using a ByteBuffer, so little is gained from that.
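
To make the IntBuffer variant mentioned above concrete, here is a rough sketch of what an int-wise copy through `asIntBuffer()` views could look like. The class `IntBufferCopy` and the method `copyIntWise` are hypothetical names for this example, and the tiny 16-byte buffers are only there to keep it quick to run, so it says nothing about performance.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.IntBuffer;

class IntBufferCopy
{
    // Hypothetical helper: copies int by int through IntBuffer views and returns the number of ints copied
    static int copyIntWise(ByteBuffer source, ByteBuffer destination)
    {
        // The views share the bytes of the underlying buffers but keep their own positions,
        // so the positions of source and destination are left unchanged
        IntBuffer sourceInts = source.asIntBuffer();
        IntBuffer destinationInts = destination.asIntBuffer();

        int copied = 0;
        while (sourceInts.hasRemaining() && destinationInts.hasRemaining())
        {
            destinationInts.put(sourceInts.get());
            copied++;
        }
        return copied;
    }

    public static void main(String[] arguments)
    {
        ByteBuffer source = ByteBuffer.allocateDirect(16).order(ByteOrder.nativeOrder());
        ByteBuffer destination = ByteBuffer.allocateDirect(16).order(ByteOrder.nativeOrder());

        for (int value = 0; value < 4; value++)
        {
            source.putInt(value);
        }
        source.clear(); // back to position 0 so the view covers the whole buffer

        System.out.println(copyIntWise(source, destination)); // prints 4
        System.out.println(source.equals(destination));       // prints true
    }
}
```

Because the views have independent positions, both byte buffers are still at position 0 after the copy, so the final `equals()` call compares all 16 bytes.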

BullyWiiPlaza
  • When you compare the buffers with `sourceByteBuffer.equals(destinationByteBuffer)` after having performed the copy using ByteBuffer methods, no work is actually done: the position of both buffers is at the end of the buffer. That's where most of the time is spent in the "unsafe" case. To make it actually compare things, you should `...position(0)` both buffers prior to comparing them. In fact, you should probably not include that into timing, and then you'll see that unsafe is faster. On my machine it's ~200 ms for the unsafe case vs ~600 ms for the byte buffer case (measured using your code). – starikoff May 12 '17 at 11:11
  • @starikoff: Thank you for the `equals()` correction. Why does the performance of `Unsafe` vary so much? It seems pretty inconsistent. Now it's also faster for me but not always... – BullyWiiPlaza May 13 '17 at 12:22
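
The `equals()` behaviour discussed in these comments can be reproduced in isolation. `ByteBuffer.equals()` only compares the remaining bytes (from position to limit), so two buffers that are both positioned at their end compare as equal regardless of content. The sketch below illustrates this with two small heap buffers; the class `BufferEqualsDemo` and the helper `contentEquals` are names invented for this example.

```java
import java.nio.ByteBuffer;

class BufferEqualsDemo
{
    // Hypothetical helper: compares the full content via duplicates reset to position 0,
    // leaving the position and limit of the original buffers untouched
    static boolean contentEquals(ByteBuffer first, ByteBuffer second)
    {
        ByteBuffer firstView = first.duplicate();
        ByteBuffer secondView = second.duplicate();
        firstView.clear();  // position = 0, limit = capacity
        secondView.clear();
        return firstView.equals(secondView);
    }

    public static void main(String[] arguments)
    {
        ByteBuffer first = ByteBuffer.allocate(8);
        ByteBuffer second = ByteBuffer.allocate(8);

        // Relative puts leave both positions at the end of the buffers
        first.putInt(1).putInt(2);
        second.putInt(3).putInt(4);

        System.out.println(first.equals(second));         // prints true: nothing remains to compare
        System.out.println(contentEquals(first, second)); // prints false: the contents differ
    }
}
```

Applied to the test in the question, this would mean resetting both buffers to position 0 (or comparing duplicates as above) before calling `equals()`, and keeping that comparison outside the timed section.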

0 Answers