I am working on a Java application where I need to send an array of 500,000 integers from one Android phone to another Android phone over a socket connection as quickly as possible. The main bottleneck seems to be converting the integers so the socket can take them, whether I use ObjectOutputStreams, ByteBuffers, or a low level mask-and-shift conversion. What is the fastest way to send an int[] over a socket from one Java app to another?

Here is the code for everything I've tried so far, with benchmarks on the LG Optimus V I'm testing on (600 MHz ARM processor, Android 2.2).

Low level mask-and-shift: 0.2 seconds

public static byte[] intToByte(int[] input)
{
    byte[] output = new byte[input.length*4];

    for(int i = 0; i < input.length; i++) {
        output[i*4] = (byte)(input[i] & 0xFF);
        output[i*4 + 1] = (byte)((input[i] & 0xFF00) >>> 8);
        output[i*4 + 2] = (byte)((input[i] & 0xFF0000) >>> 16);
        output[i*4 + 3] = (byte)((input[i] & 0xFF000000) >>> 24);
    }

    return output;
}

Using ByteBuffer and IntBuffer: 0.75 seconds

public static byte[] intToByte(int[] input)
{
    ByteBuffer byteBuffer = ByteBuffer.allocate(input.length * 4);        
    IntBuffer intBuffer = byteBuffer.asIntBuffer();
    intBuffer.put(input);

    byte[] array = byteBuffer.array();

    return array;
}

ObjectOutputStream: 3.1 seconds (I tried variations of this using DataOutputStream and writeInt() instead of writeObject(), but it didn't make much of a difference)

public static void sendSerialDataTCP(String address, int[] array) throws IOException
{
    Socket senderSocket = new Socket(address, 4446);

    OutputStream os = senderSocket.getOutputStream();
    BufferedOutputStream  bos = new BufferedOutputStream (os);
    ObjectOutputStream oos = new ObjectOutputStream(bos);
    oos.writeObject(array);

    oos.flush();
    bos.flush();
    os.flush();
    oos.close();
    os.close();
    bos.close();

    senderSocket.close();
}

Lastly, the code I used to send the byte[]: this takes an additional 0.2 seconds on top of the intToByte() functions

public static void sendDataTCP(String address, byte[] data) throws IOException
{
    Socket senderSocket = new Socket(address, 4446);

    OutputStream os = senderSocket.getOutputStream();
    os.write(data, 0, data.length);
    os.flush();

    senderSocket.close();
}

I'm writing the code on both sides of the socket so I can try any kind of endianness, compression, serialization, etc. There's got to be a way to do this conversion more efficiently in Java. Please help!

Jeremy Fowers
  • What happens if you flush the OutputStream first? – huseyin tugrul buyukisik Sep 07 '12 at 14:26
  • @tuğrulbüyükışık: I just tried it out, and it takes 50% longer when I flush the outputstream before the objectoutputstream: 9 seconds EDIT: sorry, I was going fast and took the benchmark wrong. It actually makes no difference. – Jeremy Fowers Sep 07 '12 at 14:29
  • How long does it take to write the data to a memory stream as opposed to the socket? To give you some baseline. This will give you baseline for writing and isolate that from sending over the socket. – grieve Sep 07 '12 at 14:30
  • Is your include from java.io or from CORBA ? – huseyin tugrul buyukisik Sep 07 '12 at 14:31
  • @grieve: I'm not totally sure what you mean by a memory stream, but calling System.arraycopy() to duplicate the int[] in memory only takes a few hundredths of a second (order of magnitude faster than casting to byte[]). – Jeremy Fowers Sep 07 '12 at 14:32
  • @tuğrulbüyükışık my include is from java.io – Jeremy Fowers Sep 07 '12 at 14:32
  • I mean create an outputstream backed by memory instead of the socket, and see how long that takes. – grieve Sep 07 '12 at 14:32
  • 500,000 integers is 2 MB, or ~20 Mbits, which on a 100 Mbit network is about 0.2 seconds, ignoring network overheads and any processing lags introduced by the OS on each end. What is your network speed, and what are you expecting for performance? – parsifal Sep 07 '12 at 14:38
  • Why are you using .writeObject() ? since your array is int you should write .writeInt() :D – huseyin tugrul buyukisik Sep 07 '12 at 14:39
  • @parsifal I'm not too worried about the network performance; I accept that I can't do much better than 0.2 seconds there. What I need help with is converting from a int[] to a byte[] in under 0.2 seconds (see the intToByte function). – Jeremy Fowers Sep 07 '12 at 14:40
  • If you just need the data in a byte array you can use ByteArrayOutputStream. – grieve Sep 07 '12 at 14:41
  • @grieve I need to send the int[] over a socket and the two ways I'm aware of are (a) convert the int[] to a byte[] and pass it to the socket's output stream, or (b) serialize the int[] and pass that to the socket's output stream. I'll look into ByteArrayOutputStream – Jeremy Fowers Sep 07 '12 at 14:45
  • Do you care about endianness at all? – grieve Sep 07 '12 at 14:47
  • One problem that I see is that you're creating a large destination array, which is never a good thing on a memory-limited device. I'd suggest just using `DataOutputStream` wrapping a `BufferedOutputStream`, and not trying to do the conversion yourself. – parsifal Sep 07 '12 at 14:48
  • @grieve I'm transferring the data between two identical Optimus Vs right now, so I can play with the endianness if I have to. Would that help? – Jeremy Fowers Sep 07 '12 at 14:49
  • But, really, you might just be banging up against the limits of your processor. 600 Mhz roughly (very roughly) translates into 600 MM operations per second. Some percentage of that goes to the OS, some percentage goes to your program, some percentage is wasted. You're in a loop that has 0.5MM iterations, so every operation in that loop takes a significant percentage of the available processor. – parsifal Sep 07 '12 at 14:49
  • @parsifal: That will work well if the other side can use a DataInputStream to read it. I am not sure if he controls both sides. – grieve Sep 07 '12 at 14:49
  • @grieve and parsifal: I do indeed control both sides. I'll update the question to reflect that. – Jeremy Fowers Sep 07 '12 at 14:51
  • @grieve - no difference from the current approach that he's using; `DataOutputStream` (as opposed to `ObjectOutputStream`) writes raw binary data. – parsifal Sep 07 '12 at 14:51
  • One other thought is that you can potentially compress the data if you are not CPU bound. If it is not pure randomness it should improve the time it takes to send over the network. http://docs.oracle.com/javase/6/docs/api/index.html?java/io/ByteArrayOutputStream.html – grieve Sep 07 '12 at 14:53
  • @tuğrulbüyükışık I tried implementing with writeInt() and while it's faster, it's only 5% faster. – Jeremy Fowers Sep 07 '12 at 15:03
  • @JeremyFowers Take a look at my answer below :) – Eng.Fouad Sep 07 '12 at 15:04
  • My suggestion would be, rather than convert a large int[] to a large byte[] all at once, stream the data by breaking it up into chunks and sending a little at a time. As long as the output stream is buffered, the write should return immediately and allow you to be converting the next chunk while the last chunk is being sent over the network. – Alex Sep 07 '12 at 15:11
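
The chunked approach suggested in the last comment could look roughly like the sketch below. The class name `ChunkedIntWriter` and the chunk size of 4096 ints are assumptions made for illustration, and the byte order is little-endian to match the mask-and-shift code in the question. Whether conversion actually overlaps with transmission depends on the socket's send buffer and has not been measured here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class ChunkedIntWriter {
    // Hypothetical chunk size: 4096 ints use a 16 KiB scratch buffer. Tune by measurement.
    private static final int CHUNK_INTS = 4096;

    /** Converts and writes the ints in little-endian order, one chunk at a time. */
    public static void writeInts(int[] input, OutputStream out) throws IOException {
        byte[] scratch = new byte[CHUNK_INTS * 4]; // reused for every chunk
        for (int start = 0; start < input.length; start += CHUNK_INTS) {
            int count = Math.min(CHUNK_INTS, input.length - start);
            int pos = 0;
            for (int i = start; i < start + count; i++) {
                int v = input[i];
                scratch[pos++] = (byte) v;          // lowest byte first
                scratch[pos++] = (byte) (v >>> 8);
                scratch[pos++] = (byte) (v >>> 16);
                scratch[pos++] = (byte) (v >>> 24);
            }
            out.write(scratch, 0, pos); // hand the finished chunk to the stream
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        int[] data = {1, -1, 0x12345678};
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for the socket stream
        writeInts(data, sink);
        System.out.println(sink.size()); // 12 bytes for 3 ints
    }
}
```

With a real connection, `out` would be `senderSocket.getOutputStream()`. The small reusable scratch buffer also avoids allocating a second 2 MB array on a memory-limited device.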

4 Answers

As I noted in a comment, I think you're banging against the limits of your processor. As this might be helpful to others, I'll break it down. Here's your loop to convert integers to bytes:

    for(int i = 0; i < input.length; i++) {
        output[i*4] = (byte)(input[i] & 0xFF);
        output[i*4 + 1] = (byte)((input[i] & 0xFF00) >>> 8);
        output[i*4 + 2] = (byte)((input[i] & 0xFF0000) >>> 16);
        output[i*4 + 3] = (byte)((input[i] & 0xFF000000) >>> 24);
    }

This loop executes 500,000 times. Your 600 MHz processor can process roughly 600,000,000 operations per second. So every operation in the loop body costs roughly 1/1200 of a second over the full run (500,000 / 600,000,000).

Again, using very rough numbers (I don't know the ARM instruction set, so there may be more or less per action), here's an operation count:

  • Test/branch: 5 (retrieve counter, retrieve array length, compare, branch, increment counter)
  • Mask and shift: 10 x 4 (retrieve counter, retrieve input array base, add, retrieve mask, and, shift, multiply counter, add offset, add to output base, store)

OK, so in rough numbers, this loop takes at best 45/1200 of a second (5 + 10 x 4 = 45 operations per iteration), or about 0.04 seconds. However, you're not dealing with the best-case scenario. For one thing, with an array this large you're not going to benefit from a processor cache, so you'll introduce wait states into every array store and load.
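
The back-of-the-envelope estimate can be reproduced with a few lines of Java. This is only the arithmetic from the operation count above (5 for the test/branch plus 10 x 4 for the mask-and-shift), under the same rough assumption of one operation per clock cycle; the class and method names are made up for the example.

```java
public class LoopEstimate {
    /** Best-case run time: total operations divided by operations per second. */
    public static double estimateSeconds(long iterations, int opsPerIteration, double opsPerSecond) {
        return iterations * opsPerIteration / opsPerSecond;
    }

    public static void main(String[] args) {
        int opsPerIteration = 5 + 10 * 4;   // test/branch + four mask-and-shift stores = 45
        double seconds = estimateSeconds(500_000, opsPerIteration, 600e6);
        System.out.println(opsPerIteration + " ops per iteration, best case " + seconds + " s");
    }
}
```

The result is 0.0375 seconds, so the measured 0.2 seconds is roughly five times the idealized figure, which is the gap the cache and code-generation effects would have to account for.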

Plus, the basic operations that I described may or may not translate directly into machine code. If not (and I suspect not), the loop will cost more than I've described.

Finally, if you're really unlucky, the JVM hasn't JIT-ed your code, so for some portion (or all) of the loop it's interpreting bytecode rather than executing native instructions. I don't know enough about Dalvik to comment on that.

parsifal
  • I agree that running the loop you posted probably maxes out my processor. I guess my real question is: why doesn't Java have a better way to send an int[] through a socket than the brute force mask-and-shift approach? – Jeremy Fowers Sep 07 '12 at 15:16
  • I'm tempted to say "because there's no magic," but it's really a series of implementation choices, one of which is that memory is typed. Yes, if you were using C you could create a buffer, write into that buffer with an `int*`, and not have to worry about converting the integers into bytes. And if you had an OS that would do DMA transfers from process memory to device memory, you could get another boost. The goal of Java is to keep you away from the hardware. If you need to get close to the hardware, then you need to think about a language that lets you get there. – parsifal Sep 07 '12 at 15:24
  • Do you think that switching to the Android NDK to do the conversion and socket transfer in native code would help? Or would I run into the same issues when I send my huge int[] array to the native code? – Jeremy Fowers Sep 07 '12 at 15:27
  • No idea. You still need to push the data across the Java/native boundary. I'd look more at changing my code so that I don't have to push 500k values at a time. Or spec better hardware. – parsifal Sep 07 '12 at 15:39
  • This is actually a research project where I'm trying to find out just how fast I can process large amounts of data using slow hardware, so smaller arrays and faster hardware aren't an option :( – Jeremy Fowers Sep 07 '12 at 15:44
  • @JeremyFowers I'd say try doing it with the NDK, maybe you get direct access to the Java memory and can send it without conversion. Sockets and access to Java data etc is available via the NDK. E.g. direct `ByteBuffer` uses natively allocated memory ([source](https://android.googlesource.com/platform/dalvik/+/master/vm/oo/Array.cpp)) – zapl Sep 07 '12 at 16:42

Java was IMO never intended to be able to efficiently reinterpret a memory region from int[] to byte[] like you could do in C. It doesn't even have such a memory address model.

You either need to go native to send the data or you can try to find some micro optimizations. But I doubt you will gain a lot.

E.g. this could be slightly faster than your version (if it works at all)

public static byte[] intToByte(int[] input)
{
    byte[] output = new byte[input.length*4];

    for(int i = 0; i < input.length; i++) {
        int position = i << 2;
        output[position | 0] = (byte)((input[i] >>  0) & 0xFF);
        output[position | 1] = (byte)((input[i] >>  8) & 0xFF);
        output[position | 2] = (byte)((input[i] >> 16) & 0xFF);
        output[position | 3] = (byte)((input[i] >> 24) & 0xFF);
    }
    return output;
}
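
Whichever variant is used, the receiving side has to reverse the same byte order. Below is a minimal round-trip sketch: `byteToInt` and the class name `IntByteCodec` are hypothetical names introduced here, and the layout is the little-endian one used by the conversion above.

```java
import java.util.Arrays;

public class IntByteCodec {
    /** Little-endian int[] -> byte[] conversion, same layout as the loop above. */
    public static byte[] intToByte(int[] input) {
        byte[] output = new byte[input.length * 4];
        for (int i = 0; i < input.length; i++) {
            int p = i << 2;
            int v = input[i];
            output[p]     = (byte) v;
            output[p + 1] = (byte) (v >>> 8);
            output[p + 2] = (byte) (v >>> 16);
            output[p + 3] = (byte) (v >>> 24);
        }
        return output;
    }

    /** Inverse conversion for the receiver; trailing bytes beyond a multiple of 4 are ignored. */
    public static int[] byteToInt(byte[] input) {
        int[] output = new int[input.length / 4];
        for (int i = 0; i < output.length; i++) {
            int p = i << 2;
            // Mask each byte to undo sign extension before shifting it into place.
            output[i] = (input[p] & 0xFF)
                      | (input[p + 1] & 0xFF) << 8
                      | (input[p + 2] & 0xFF) << 16
                      | (input[p + 3] & 0xFF) << 24;
        }
        return output;
    }

    public static void main(String[] args) {
        int[] data = {0, 1, -1, Integer.MIN_VALUE, Integer.MAX_VALUE, 0x12345678};
        int[] back = byteToInt(intToByte(data));
        System.out.println(Arrays.equals(data, back)); // true
    }
}
```

Running the round trip on a handful of edge values (negative numbers in particular) is a cheap way to confirm that a hand-written conversion "works at all" before benchmarking it.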
zapl
  • I don't necessarily need to convert the int[] to a byte[]. I just need to send the int[] quickly, and right now converting to a byte[] is fastest. Is there a way to send the int[] without converting it? – Jeremy Fowers Sep 07 '12 at 14:46
  • `OutputStream` is "A writable sink for bytes." No way to send int directly. All the subclasses will do a conversion at some point. – zapl Sep 07 '12 at 14:49
  • I don't understand why writing the raw, serialized bytes of the int[] (using ObjectOutputStream) is way slower than manually extracting the byte data out of the int[] (as you have suggested). – Jeremy Fowers Sep 07 '12 at 14:55
  • `ObjectOutputStream` does about the same but also writes an object descriptor and similar metadata to the stream. That's unnecessary extra work. – zapl Sep 07 '12 at 15:02
  • @Jeremy: ObjectOutputStream writes a serialized version of the array as a Java object, so it includes extra information such as its class descriptor, because the receiver has to be able to reconstruct the whole array without knowing in advance that it is an int array. – Robert Sep 07 '12 at 15:07
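
One way to see the extra data that `ObjectOutputStream` puts on the wire is to serialize the same array into memory with both stream types and compare the byte counts. This is a small sketch with made-up class and method names; it compares output size only and says nothing about where the measured 3.1 seconds goes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

public class SerializationOverhead {
    /** Number of bytes produced by writeObject() for the array. */
    public static int objectStreamSize(int[] array) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(sink);
        oos.writeObject(array);
        oos.close();
        return sink.size();
    }

    /** Number of bytes produced by one writeInt() per element: exactly 4 per int. */
    public static int dataStreamSize(int[] array) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(sink);
        for (int v : array) {
            dos.writeInt(v);
        }
        dos.close();
        return sink.size();
    }

    public static void main(String[] args) throws IOException {
        int[] array = new int[1000];
        System.out.println("DataOutputStream:   " + dataStreamSize(array) + " bytes");
        System.out.println("ObjectOutputStream: " + objectStreamSize(array) + " bytes");
    }
}
```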

If you're not averse to using a library, you might want to check out Protocol Buffers from Google. It's built for much more complex object serialization, but I'd bet that they worked hard to figure out how to quickly serialize an array of integers in Java.

EDIT: I looked in the Protobuf source code, and it uses something similar to your low-level mask and shift.

japreiss

I would do it like this:

Socket senderSocket = new Socket(address, 4446);

OutputStream os = senderSocket.getOutputStream();
BufferedOutputStream bos = new BufferedOutputStream(os);
DataOutputStream dos = new DataOutputStream(bos);

dos.writeInt(array.length);
for(int i : array) dos.writeInt(i);
dos.close();

On the other side, read it like:

Socket receiverSocket = ...;
InputStream is = receiverSocket.getInputStream();
BufferedInputStream bis = new BufferedInputStream(is);
DataInputStream dis = new DataInputStream(bis);

int length = dis.readInt();
int[] array = new int[length];

for(int i = 0; i < length; i++) array[i] = dis.readInt();
dis.close();
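
The length-prefixed protocol above can be tried without a network by pointing both ends at an in-memory stream, which also gives the kind of baseline suggested in the comments on the question. The sketch below assumes nothing beyond `java.io`; the class name `LengthPrefixedInts` and its method names are invented for the example.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

public class LengthPrefixedInts {
    /** Writes the array length followed by every element (big-endian, as DataOutputStream defines). */
    public static void write(int[] array, OutputStream out) throws IOException {
        DataOutputStream dos = new DataOutputStream(new BufferedOutputStream(out));
        dos.writeInt(array.length);
        for (int v : array) {
            dos.writeInt(v);
        }
        dos.flush(); // push buffered bytes through without closing the caller's stream
    }

    /** Reads the length prefix, then that many ints. */
    public static int[] read(InputStream in) throws IOException {
        DataInputStream dis = new DataInputStream(new BufferedInputStream(in));
        int length = dis.readInt();
        int[] array = new int[length];
        for (int i = 0; i < length; i++) {
            array[i] = dis.readInt();
        }
        return array;
    }

    public static void main(String[] args) throws IOException {
        int[] data = {3, -7, 500000};
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stands in for the socket
        write(data, sink);
        int[] back = read(new ByteArrayInputStream(sink.toByteArray()));
        System.out.println(Arrays.equals(data, back)); // true
    }
}
```

Wrapping the calls to `write` and `read` in `System.nanoTime()` measurements would separate the cost of the conversion from the cost of the socket transfer.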
Eng.Fouad