Java limits a single MappedByteBuffer to 2 GiB, which makes it tricky to use for mapping big files. The usually recommended approach is to use an array of MappedByteBuffers and index into it like this:

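import java.nio.MappedByteBuffer;

long PAGE_SIZE = Integer.MAX_VALUE;
MappedByteBuffer[] buffers;   // one buffer per PAGE_SIZE-byte slice of the file

// Index of the buffer that contains the given file offset.
private int getPage(long offset) {
    return (int) (offset / PAGE_SIZE);
}

// Position of the given file offset inside its buffer.
private int getIndex(long offset) {
    return (int) (offset % PAGE_SIZE);
}

public byte get(long offset) {
    return buffers[getPage(offset)].get(getIndex(offset));
}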

This works for single bytes, but it requires rewriting a lot of code if you want to handle reads/writes that are bigger and may cross buffer boundaries (getLong() or get(byte[])).
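To make the boundary problem concrete, here is a minimal sketch of what such a wrapper might look like. PagedBuffer and its members are hypothetical names, not an existing library class. It assumes a power-of-two page size (so shift and mask replace division and modulo, unlike the Integer.MAX_VALUE used above), that every page except possibly the last is full, and that the buffers keep their default big-endian byte order.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

/** Hypothetical helper: a long-addressable view over an array of equally sized pages. */
final class PagedBuffer {
    private final ByteBuffer[] pages;
    private final int pageShift;   // log2 of the page size (assumed power of two)
    private final long pageMask;   // pageSize - 1

    /** Every page must hold 1 << pageShift bytes, except possibly the last one. */
    PagedBuffer(ByteBuffer[] pages, int pageShift) {
        this.pages = pages;
        this.pageShift = pageShift;
        this.pageMask = (1L << pageShift) - 1;
    }

    byte get(long offset) {
        return pages[(int) (offset >>> pageShift)].get((int) (offset & pageMask));
    }

    /** Reads a big-endian long, even when its 8 bytes straddle two pages. */
    long getLong(long offset) {
        int index = (int) (offset & pageMask);
        ByteBuffer page = pages[(int) (offset >>> pageShift)];
        if (index + 8 <= page.limit()) {
            return page.getLong(index);   // fast path: fully inside one page
        }
        long value = 0;                   // slow path: assemble byte by byte
        for (int i = 0; i < 8; i++) {
            value = (value << 8) | (get(offset + i) & 0xFFL);
        }
        return value;
    }

    /** Fills dst starting at offset, copying page by page. */
    void get(long offset, byte[] dst) {
        int copied = 0;
        while (copied < dst.length) {
            // duplicate() gives an independent position, so the shared page is not disturbed
            ByteBuffer page = pages[(int) (offset >>> pageShift)].duplicate();
            page.position((int) (offset & pageMask));
            int n = Math.min(dst.length - copied, page.remaining());
            if (n == 0) {
                throw new BufferUnderflowException();   // ran past the end of the data
            }
            page.get(dst, copied, n);
            copied += n;
            offset += n;
        }
    }
}
```

Every multi-byte accessor (getInt(), putLong(), put(byte[]) and so on) needs the same fast-path/slow-path treatment, which is where the amount of code comes from.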

The question: what is your best practice for this kind of scenario? Do you know of any working solution/code that can be reused without reinventing the wheel?

marcorossi
  • Integer.MAX_VALUE is not a power of 2 nor a multiple of the underlying page size unfortunately. (Which is usually something like 4KB) – Peter Lawrey Apr 15 '11 at 11:15
  • sorry, i didn't get your comment – marcorossi Apr 15 '11 at 11:30
  • Internally, it aligned DirectByteBuffers by page size, and I would have thought, using data not aligned by page would be less efficient, and I assume not allowed. (However I have tested it and it is) – Peter Lawrey Apr 15 '11 at 11:42
  • You can map in a larger than 2 GB block using underlying native methods directly (with reflection) however I haven't figured how to force it to perform writes to disk. I doubt this counts as best practice, but can be much faster. ;) – Peter Lawrey Apr 15 '11 at 12:37
  • Did you read the answers/comments from the last time you asked a similar question: http://stackoverflow.com/questions/5614206/buffered-randomaccessfile-java – Anon Apr 29 '11 at 13:23
  • yes, as you can see there's a comment by me to each of them. actually one yesterday. – marcorossi May 03 '11 at 10:08

1 Answer

Have you checked out dsiutils' ByteBufferInputStream?

Javadoc

The main usefulness of this class is that of making it possible creating input streams that are really based on a MappedByteBuffer.

In particular, the factory method map(FileChannel, FileChannel.MapMode) will memory-map an entire file into an array of ByteBuffer and expose the array as a ByteBufferInputStream. This makes it possible to access easily mapped files larger than 2GiB.

  • long length()
  • long position()
  • void position(long newPosition)

Is that something you were thinking of? It's LGPL too.
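For comparison, the chunked mapping that the Javadoc describes can be approximated with plain NIO. The sketch below is not dsiutils' implementation; ChunkedMapper and mapReadOnly are hypothetical names, and the chunk size is left to the caller (something like 1 << 30 would be a reasonable assumption for real files).

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper, not part of dsiutils. */
final class ChunkedMapper {
    /** Maps the whole file read-only in chunks of at most chunkSize bytes. */
    static MappedByteBuffer[] mapReadOnly(Path file, int chunkSize) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = channel.size();
            int chunks = (int) ((size + chunkSize - 1) / chunkSize);   // round up
            MappedByteBuffer[] buffers = new MappedByteBuffer[chunks];
            for (int i = 0; i < chunks; i++) {
                long start = (long) i * chunkSize;
                long length = Math.min(chunkSize, size - start);       // last chunk may be short
                buffers[i] = channel.map(FileChannel.MapMode.READ_ONLY, start, length);
            }
            return buffers;   // mappings stay valid after the channel is closed
        }
    }
}
```

Calling ChunkedMapper.mapReadOnly(path, 1 << 30) would return one buffer per GiB of the file. The library class adds the stream interface and long-based positioning on top of such an array.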

Dawnkeeper
The Alchemist
  • Amazing! Check it out here http://mvnrepository.com/artifact/it.unimi.dsi/dsiutils Sebastiano Vigna is magnifico! – aaiezza Jul 01 '15 at 03:48