
Is there a way in Java to write to disk a large array of, say, integers? I am doing this on an Android, and have not found a method that comes anywhere close to native C code.

The resulting file need not be portable to different machines with different representations, so logically just a bulk write of the underlying bytes should be sufficient. But I don't know how to do that efficiently from Java.

I have tried searching the net, and tested the following:

  • Serialization - very slow, as expected.
  • Using NIO - still slow; an Android trace shows the operations happening one integer at a time (code below):

Thanks in advance


NIO code:

int[] array = new int[10000000];

...

RandomAccessFile raf = new RandomAccessFile(ti.testFileName, "rw");
FileChannel chan = raf.getChannel();
MappedByteBuffer out = chan.map(FileChannel.MapMode.READ_WRITE, 0, array.length * 4L);
IntBuffer ib = out.asIntBuffer();
ib.put(array);
out.force();
raf.close();
Mesocyclone
  • http://stackoverflow.com/questions/2017868/java-writing-large-files – jmj Dec 19 '10 at 20:45
  • If this is an Android rather than Java, shouldn't your tags reflect this? – Peter Lawrey Dec 19 '10 at 20:45
  • http://stackoverflow.com/questions/1062113/fastest-way-to-write-huge-data-in-text-file-java – jmj Dec 19 '10 at 20:46
  • 1
    @org.life.java This link talks about writing text rather than binary which won't be a faster solution. – Peter Lawrey Dec 19 '10 at 20:48
  • @Peter These links are useful to the OP: he wants to write an array to disk, and he can get the basic idea from them. – jmj Dec 19 '10 at 20:52
  • Writing 40MB of integers will take a minor eternity on flash, because flash itself is slow, and unpredictably slow (e.g., wear leveling). – CommonsWare Dec 19 '10 at 21:28
  • The first answer (minus the flip) indeed sped up writes somewhat (I would have upvoted it but don't have the rep to do so). However, reads are still very slow. Both are very slow compared to using JNI and C. – Mesocyclone Dec 20 '10 at 03:01
  • More specifically: For 1MB: W/R old way: .15/.84 secs new way:.05/.95 secs JNI: <.08/<.08. Can someone suggest read code that doesn't have the JVM converting bytes to ints (which is what is slowing it down)? BTW.. Droid-X is the platform under test. – Mesocyclone Dec 20 '10 at 03:11

4 Answers


You said it was slow, but the speed will largely depend on your disk subsystem. A regular desktop disk should be able to write and commit 40 MB in about half a second.

The following uses NIO and takes 665 ms to write and 62 ms to read on a workstation. The read and write shuffle the same amount of data, but the read can take its data from the OS cache; the difference is how long it takes to commit to disk.

int[] ints = new int[10 * 1000 * 1000];
long start = System.nanoTime();

ByteBuffer byteBuffer = ByteBuffer.allocateDirect(ints.length*4+4);
byteBuffer.putInt(ints.length);
IntBuffer intBuffer = byteBuffer.asIntBuffer();
intBuffer.put(ints);
byteBuffer.position(0);

FileChannel fc = new FileOutputStream("main.dat").getChannel();
fc.write(byteBuffer);
fc.force(false);
fc.close();
long time = System.nanoTime() - start;
System.out.println("Write time " + time / 1000 / 1000 + " ms.");

long start2 = System.nanoTime();
FileChannel fc2 = new FileInputStream("main.dat").getChannel();
ByteBuffer lengthBuffer = ByteBuffer.allocate(4);
while(lengthBuffer.remaining()>0) fc2.read(lengthBuffer);
int length = lengthBuffer.getInt(0);

int[] ints2 = new int[length];
ByteBuffer buffer2 = ByteBuffer.allocateDirect(length*4);
while(buffer2.remaining()>0 && fc2.read(buffer2) > 0);
buffer2.flip();
buffer2.asIntBuffer().get(ints2);
long time2 = System.nanoTime() - start2;
System.out.println("Read time " + time2 / 1000 / 1000 + " ms.");

I have added the length to the start of the file so it doesn't have to be assumed. BTW: There was a bug in the write which I have fixed.
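If the copy into an int[] dominates your read time, one alternative (suggested in the comments below) is to use the data in place through a memory-mapped IntBuffer, without ever materialising a Java array. A minimal sketch, with an illustrative file name and size; the write half just produces something to read back:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.IntBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedIntRead {
    public static void main(String[] args) throws IOException {
        int n = 1_000_000;
        try (RandomAccessFile raf = new RandomAccessFile("mapped.dat", "rw");
             FileChannel fc = raf.getChannel()) {
            // Write n ints through a read-write mapping.
            MappedByteBuffer map = fc.map(FileChannel.MapMode.READ_WRITE, 0, n * 4L);
            IntBuffer out = map.asIntBuffer();
            for (int i = 0; i < n; i++) out.put(i, i);
            map.force();

            // Read "in place": no int[] copy, the OS pages the data in on demand.
            IntBuffer view = fc.map(FileChannel.MapMode.READ_ONLY, 0, n * 4L).asIntBuffer();
            long sum = 0;
            for (int i = 0; i < n; i++) sum += view.get(i);
            System.out.println(sum); // sum of 0..999999
        }
    }
}
```

Whether this beats copying into an int[] depends on the access pattern: sequential scans map well, while random access pays a page-fault cost per miss.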

Peter Lawrey
  • As noted above, this did help some. Now I need a read solution. Tks – Mesocyclone Dec 20 '10 at 03:13
  • The read is still very slow. For this whole program: R/W: .70/.05 (avg). For random, R/W: .05/.000. The read time is all taken in .get – Mesocyclone Dec 21 '10 at 01:50
  • It must be your JVM or your hardware. On my machine the read takes 46 ms, the get takes 26 ms. That's 1.5 GB/s transfer, which is pretty decent. You may have to try different things to see what is most efficient for your JVM. I can assure you this is just one native method call on my JVM, no Java loops. One solution might be to use the memory-mapped IntBuffer in place without copying it into the int[]. In my case that wouldn't save much, but it might on your system. – Peter Lawrey Dec 21 '10 at 10:48
  • 1
    My JVM is Android Dalvik, which does put in loops. I have decided to use JNI and C, as it seems the only way to do this quickly. Thanks. – Mesocyclone Dec 25 '10 at 04:15

I have no idea about the Android implementation, but in standard Java, good old-fashioned IO often outperforms NIO.

For example I believe the following code should be relatively fast if you have an array of bytes:

byte[] bytes = new byte[10000];
// ...
FileOutputStream out = new FileOutputStream(...);
try {
    out.write(bytes);
} finally {
    out.close();
}

Bear in mind that this will block until the entire array of bytes is written. But you don't say whether non-blocking behaviour is a problem or not.

Another thing you don't mention is how you intend to encode the integers when writing into the file. You need to perform the encoding in memory before writing to file, but it's possible that the array is too large to encode all at once, in which case you can encode/write in blocks of several hundred K.
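A minimal sketch of that block-wise approach, encoding each chunk into a ByteBuffer before handing it to a FileChannel; the chunk size (256 K ints) and file name are illustrative, not prescribed:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class ChunkedIntWriter {
    // Hypothetical chunk size: 256 K ints (1 MB) per write; tune for your device.
    static final int CHUNK = 256 * 1024;

    static void write(File file, int[] data) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(CHUNK * 4);
        FileChannel fc = new FileOutputStream(file).getChannel();
        try {
            for (int off = 0; off < data.length; off += CHUNK) {
                int n = Math.min(CHUNK, data.length - off);
                buf.clear();
                buf.asIntBuffer().put(data, off, n); // bulk encode, no per-int loop
                buf.limit(n * 4);
                while (buf.hasRemaining()) fc.write(buf);
            }
        } finally {
            fc.close();
        }
    }

    public static void main(String[] args) throws IOException {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) data[i] = i;
        File f = new File("ints.bin");
        write(f, data);
        System.out.println(f.length()); // 1,000,000 ints * 4 bytes
    }
}
```

The single direct buffer is reused across chunks, so memory stays bounded no matter how large the array is.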

Neil Bartlett
  • 1
    There is no problem moving bytes. The issue is using types other than bytes, which, even though they are in the right binary form, end up being converted, one element at a time, to the java form (i.e. the conversion in effect does nothing). The NIO solution avoids that for writes, but I don't know how to do it for reads. – Mesocyclone Dec 20 '10 at 03:13
  • I believe I answered this concern, if you care to read the whole answer. – Neil Bartlett Dec 20 '10 at 10:47
  • No, my whole concern was how to read/write arrays of integers quickly. I stated that format is not an issue - i.e. a simple dump to/from the file of the underlying array is adequate. However, I cannot get Java to do this. I can do it in C with the Java array, and it works fine and is fast (see comments above) – Mesocyclone Dec 21 '10 at 01:52

Peter,

When something seems too good to be true, it usually is. 89 msecs to write 40 MB of data suggests your HDD has a bandwidth of well over 500 MB/sec (since you also included the time to open and close the file). That is unlikely to be true. Did you check that the file is in fact 40 MB? Also, I would suggest that you initialise the buffer to verify the file contents are not all zeros; an untouched buffer may simply be skipped. Whatever it is, the number you have is too good to be true.

Thanks.

Virtually Real
  • BTW, in your code, I think the flip is the problem. If you remove it you will perhaps see real output. I am betting as is, your output file is 0 bytes. – Virtually Real Dec 19 '10 at 22:57

Consider buffering your output stream, e.g. by wrapping it in a BufferedOutputStream.
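A minimal sketch of that idea, wrapping a DataOutputStream around a BufferedOutputStream so each writeInt() hits memory rather than the disk; the 64 KB buffer size and file name are illustrative:

```java
import java.io.BufferedOutputStream;
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

public class BufferedIntWrite {
    public static void main(String[] args) throws IOException {
        int[] data = new int[100_000];
        for (int i = 0; i < data.length; i++) data[i] = i * 3;

        File f = new File("buffered.dat");
        // The buffer batches the many small writeInt() calls into large disk writes.
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(f), 64 * 1024))) {
            for (int v : data) out.writeInt(v);
        }
        System.out.println(f.length()); // 100,000 ints * 4 bytes
    }
}
```

This still loops per integer in Java, so on Dalvik it may not match the bulk NIO put(), but it is a large improvement over an unbuffered stream.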

Thorbjørn Ravn Andersen