2

I have binary data in a file (a list of 32-bit integer values) that I need to load into an int array efficiently. The only way I can see to do this is to load the data into a byte[] and then convert it into an int[] one element at a time. This is too slow for the amount of data I need to load: the conversion takes about 3 seconds on an actual phone, whereas reading the data from the file into the byte[] is pretty much instantaneous.

Are there any libraries that use native methods for reading an int[] from a file, or for converting a byte[] to int[]?

Chris
  • Are you sure it is the conversion that takes most of the time? I'm not convinced. I'd look at I/O first. – Vladimir Dyuzhev Apr 22 '11 at 01:04
  • Yes, I have logging around the operations, and the iteration through the array is where all the time goes. In fact, I have solved the problem by writing a native method to do the manipulation. In C++ it is a simple cast to convert the data. – Chris Apr 23 '11 at 04:08
  • Hey Chris! Do you think you might paste the code here, or maybe on pastebin.com, and post a link to it here? I believe this small piece of code might be a nice enhancement for Android developers. Did you also try the NIO package as I wrote, and if so: Do you have some performance comparisons? I would be very interested in them. :) – mreichelt Apr 25 '11 at 15:19

3 Answers

3

Have you tried something like this yet:

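import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;

File file = new File("binary.file");
int count = (int) (file.length() / 4);  // 4 bytes per 32-bit int
int[] values = new int[count];

// readInt() decodes big-endian values; the buffered stream avoids one raw read per int.
// try-with-resources closes the whole stream chain when the loop finishes.
try (DataInputStream din = new DataInputStream(
        new BufferedInputStream(new FileInputStream(file)))) {
    for (int i = 0; i < count; i++) {
        values[i] = din.readInt();
    }
}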

Even on a phone this should be relatively fast, unless you're just dealing with a huge file.

WhiteFang34
  • Good point, it only works for big-endian ints. Typically it's used for transport, however the OP can reference http://stackoverflow.com/questions/5712066/send-a-int-from-objective-c-to-java-by-socket-but-the-value-changed-in-java-side/5712128#5712128 for reading little-endian if necessary. – WhiteFang34 Apr 22 '11 at 01:12
  • @road to yamburg - it is not unreasonable to assume that the OP is trying to read a binary file created on the phone itself or by some application on another Java / Android platform ... and that the file will be big-endian. – Stephen C Apr 22 '11 at 04:34
  • The problem is that my array is 1.5MB in size, and any iteration over the elements in Java like this is too slow. – Chris Apr 23 '11 at 04:09
1

Have you tried the classes of the NIO package? The class ByteBuffer has a method asIntBuffer(), where you can get an IntBuffer view of the ByteBuffer. Then you should be able to get the content as integers by calling get(int[] dst).

The initial ByteBuffer is available by using file channels.
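As a rough sketch of the approach described above (not code from the thread): the class name `IntFileReader` and its method names are made up for illustration, and mapping the file is only one way to get a ByteBuffer from a file channel. The byte order has to be set before calling asIntBuffer(), because the view takes the order of the underlying buffer at the moment it is created.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;

// Hypothetical helper; the names are illustrative only.
final class IntFileReader {

    /** Reads a whole file as 32-bit ints stored in the given byte order. */
    static int[] readInts(String path, ByteOrder order) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r");
             FileChannel channel = raf.getChannel()) {
            // Map the file read-only instead of copying it into a byte[] first.
            ByteBuffer bytes = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return fromBuffer(bytes, order);
        }
    }

    /** Converts a byte[] that is already in memory. */
    static int[] fromBytes(byte[] data, ByteOrder order) {
        return fromBuffer(ByteBuffer.wrap(data), order);
    }

    private static int[] fromBuffer(ByteBuffer bytes, ByteOrder order) {
        bytes.order(order);                        // must happen before asIntBuffer()
        int[] values = new int[bytes.remaining() / 4];
        bytes.asIntBuffer().get(values);           // one bulk copy, no per-element loop
        return values;
    }
}
```

If the file was written with DataOutputStream, ByteOrder.BIG_ENDIAN is the matching order; data dumped from native code on a typical phone is more likely to be ByteOrder.LITTLE_ENDIAN. Whether the bulk get() is actually faster than a hand-written loop depends on the platform's buffer implementation, so it is worth measuring on the device.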

mreichelt
  • The number of operations involved in creating an IntBuffer is quite large. I wonder if a manual (a<<24)+(b<<16)+(c<<8)+d would be much faster. – Vladimir Dyuzhev Apr 22 '11 at 01:03
  • I was trying to use the NIO package, but was running into trouble using the array() method which doesn't exist for a buffer created from a byte[]. But it looks like get() is the answer. – Chris Apr 23 '11 at 04:11
  • @road to yamburg: That might be so. But chances are high that the actual implementation of byte buffers is in native code, and is therefore much faster. In fact, I believe the NIO package of Android is implemented natively - I could not find a working byte buffer implementation in Java there. @Chris: Exactly - you cannot use the array() method because your IntBuffer is not backed by an int[] array, but by a byte[] array. The documentation of array() at http://developer.android.com/reference/java/nio/IntBuffer.html#array%28%29 makes this pretty clear. :) – mreichelt Apr 23 '11 at 11:34
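For comparison, the manual shift-based conversion raised in the comments could look roughly like the sketch below. The class name `ManualIntDecoder` is made up for illustration, and the data is assumed to be big-endian. Each byte is masked with 0xFF because Java bytes are signed, and each shift is parenthesized because + binds more tightly than << in Java.

```java
// Hypothetical helper; assumes the input holds big-endian 32-bit ints.
final class ManualIntDecoder {

    /** Converts big-endian bytes to ints; trailing bytes that do not fill an int are ignored. */
    static int[] bigEndianToInts(byte[] data) {
        int[] values = new int[data.length / 4];
        for (int i = 0, j = 0; i < values.length; i++, j += 4) {
            values[i] = ((data[j]     & 0xFF) << 24)   // most significant byte first
                      | ((data[j + 1] & 0xFF) << 16)
                      | ((data[j + 2] & 0xFF) << 8)
                      |  (data[j + 3] & 0xFF);
        }
        return values;
    }
}
```

This is still a per-element loop in Java, which is the kind of iteration the question reports as too slow, so it mainly serves as a baseline to measure the buffer-based approach against.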
0

I ran a fairly careful experiment comparing deserialization with ObjectInputStream against reading with DataInputStream, both on top of a ByteArrayInputStream to avoid I/O effects. For a million ints, readObject took about 20 ms and the readInt loop about 116 ms. The serialization overhead on a million-int array was 27 bytes.

Having said that, object serialization is sort of evil, and you have to have written the data out with a Java program.
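A minimal sketch of what such a round trip might look like, using in-memory streams as in the experiment above. The class name `IntArraySerialization` is made up for illustration, and this is not the benchmark code itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical helper; the names are illustrative only.
final class IntArraySerialization {

    /** Serializes the array with standard Java object serialization. */
    static byte[] serialize(int[] values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(values);   // the whole array is written in one call
        }
        return bytes.toByteArray();
    }

    /** Reads back an int[] written by serialize(). */
    static int[] deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (int[]) in.readObject();
        }
    }
}
```

The file read on the phone would have to be produced with serialize() (or an equivalent writeObject call), which is the restriction mentioned above.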

Tim Bray