
I have created a very simple .proto file:

syntax = "proto2";


package Metadata; 


option java_package = "metadata";
option java_outer_classname = "Metadata";
option optimize_for=SPEED;
message PutMessage {
 optional string key =1;
 optional int64 dt = 2; 
}

I have created a simple client-server program in which the server echoes the client's message. All messages are PutMessages:

Server:

public class Server {
    public static void main(String args[]) throws IOException, InterruptedException {
        ServerSocket ss = new ServerSocket(8080);
        System.out.println("Listening for client");
        Socket socket = ss.accept();
        System.out.println("Server Results: ");
        while (true) {
            Long parseStart = System.nanoTime();
            PutMessage sm = PutMessage.parseDelimitedFrom(socket.getInputStream());
            Long parseEnd = System.nanoTime();

            Long writeStart = System.nanoTime();
            sm.writeDelimitedTo(socket.getOutputStream());
            Long writeEnd = System.nanoTime();

            System.out.println("Parse time: " + (parseEnd - parseStart));
            System.out.println("Write time: " + (writeEnd - writeStart));
        }
    }
}

Client:

public class Client {
    public static void main(String args[]) throws IOException, InterruptedException {
        Socket socket = new Socket("127.0.0.1", 8080);
        int A = new Integer(args[0]);
        PutMessage cm = PutMessage.newBuilder().setDt(3434).setKey("sdfwsdf").build();
        System.out.println("Client Results: ");
        for (int i = 0; i < A; i++) {
            Long writeStart = System.nanoTime();
            cm.writeDelimitedTo(socket.getOutputStream());
            Long writeEnd = System.nanoTime();

            Long parseStart = System.nanoTime();
            cm.parseDelimitedFrom(socket.getInputStream());
            Long parseEnd = System.nanoTime();

            System.out.println("Write time: " + (writeEnd - writeStart));
            System.out.println("Parse time: " + (parseEnd - parseStart));
        }
    }
}

When I run the client and server on Windows it is very fast, but when I run them on Ubuntu it takes a long time (about 70 ms) for the server to parse the client message. These are some results when I pass 3 to the client (all times in nanoseconds):

Windows:

Client Results: 
Write time: 54401508
Parse time: 95818962
Write time: 279563
Parse time: 201593
Write time: 200568
Parse time: 138500



Server Results: 
Parse time: 207099065
Write time: 42572640
Parse time: 20808241
Write time: 156966
Parse time: 209801
Write time: 124649

Ubuntu:

Client Results:
Write time: 31205019
Parse time: 86399222
Write time: 101132
Parse time: 40619478
Write time: 214496
Parse time: 79164985


Server Results:
Parse time: 183947743
Write time: 25819286
Parse time: 28680184
Write time: 292955
Parse time: 79299932
Write time: 298012

I found that if I remove setDt from the PutMessage, so that only setKey is called, it is fast on Ubuntu too. But I need to have dt. I have no clue why it is fast on Windows and slow on Ubuntu. I thought maybe the machines were different, but without setDt it is fast on Ubuntu as well, so the problem is not the hardware.
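
(As a sanity check on the payload size: the quick sketch below, with a class name of my own choosing and the PutMessage import inferred from the java_package/java_outer_classname options above, prints the serialized sizes with and without dt; the difference is only a few bytes.)

    import metadata.Metadata.PutMessage;

    public class SizeCheck {
        public static void main(String[] args) {
            // Compare serialized sizes with and without the dt field
            // (values taken from the client above).
            PutMessage withDt = PutMessage.newBuilder().setKey("sdfwsdf").setDt(3434).build();
            PutMessage withoutDt = PutMessage.newBuilder().setKey("sdfwsdf").build();
            System.out.println("with dt:    " + withDt.getSerializedSize() + " bytes");
            System.out.println("without dt: " + withoutDt.getSerializedSize() + " bytes");
        }
    }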

I have also tried proto3, with the same results. Any help is greatly appreciated.

Update: Both OSs are 64-bit. Java versions are:

Ubuntu:

java version "1.8.0_66" 
Java(TM) SE Runtime Environment (build 1.8.0_66-b17) 
Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode) 

Windows:

java version "1.8.0_151" 
Java(TM) SE Runtime Environment (build 1.8.0_151-b12) 
Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode)

Thanks

Update: I redid the experiment on two AWS EC2 instances, one Windows and the other Amazon Linux. I didn't use Ubuntu this time, but I got the same results: again, Windows was much faster. It is really weird!

spartanHp
  • What JVM are you using on the Linux side and on the Windows side? Are you using a server or client VM? Are you warming up the JVM (doesn't look like it)? – Daniel Pryden Nov 21 '17 at 01:35
  • See also: [How do I write a correct microbenchmark in Java?](https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java). – Daniel Pryden Nov 21 '17 at 01:37
  • The code that I benchmark is the same on Windows and Ubuntu, so why is one fast and the other slow? – spartanHp Nov 21 '17 at 01:40
  • It's not the code that's fast or slow. What you care about is what that code gets JITted into, and that's going to depend very much on the *exact* version of the JVM you're running it on, and the *exact* flags you're passing to it. Without that information, the only thing anyone can do to help you is guess. – Daniel Pryden Nov 21 '17 at 01:43
  • For the first iteration I agree that the time is high because of warm-up, but after the classes are loaded, why do I still get high response times, and why does it only happen on Ubuntu? – spartanHp Nov 21 '17 at 01:47
  • I checked the versions. Ubuntu: java version "1.8.0_66", Java(TM) SE Runtime Environment (build 1.8.0_66-b17), Java HotSpot(TM) 64-Bit Server VM (build 25.66-b17, mixed mode). Windows: java version "1.8.0_151", Java(TM) SE Runtime Environment (build 1.8.0_151-b12), Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode) – spartanHp Nov 21 '17 at 01:48
  • Also: is the hardware comparable between these two machines? Ideally you'd run your test on two identical machines, or run the Linux and Windows tests on VMs on the same physical machine. Given that you're seeing a performance anomaly with a 64-bit value, have you confirmed that both machines, both OSs, and both JVMs are all 64-bit? – Daniel Pryden Nov 21 '17 at 01:49
  • Please [edit] your question to include this essential information, rather than putting it only in a comment. – Daniel Pryden Nov 21 '17 at 01:51
  • It would also be useful to try separating the socket-related parts of your program from the protobuf parts, and benchmark them independently. The performance issue may not be coming from where you expect... e.g. the socket streams may have different performance, and the extra 8 bytes in the one case might be bumping over an internal buffer size or something. Creating a [mcve] is usually a good exercise. – Daniel Pryden Nov 21 '17 at 01:57

2 Answers


You are measuring IO time, not parse time. If you want to measure parse time, parse from a byte array or a ByteArrayInputStream.
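
For example, a minimal sketch along these lines (the class name is mine, and the PutMessage import is inferred from the .proto options in the question) serializes to memory first and then times only the parsing, so the socket cannot influence the numbers:

    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import metadata.Metadata.PutMessage;

    public class ParseOnlyBenchmark {
        public static void main(String[] args) throws IOException {
            // Build the same message the client sends.
            PutMessage cm = PutMessage.newBuilder().setDt(3434).setKey("sdfwsdf").build();

            // Serialize once to memory, then time only the protobuf parse step.
            byte[] bytes = cm.toByteArray();
            long parseStart = System.nanoTime();
            PutMessage parsed = PutMessage.parseFrom(bytes);
            long parseEnd = System.nanoTime();
            System.out.println("Pure parse time: " + (parseEnd - parseStart));

            // The delimited variant can be timed the same way via a ByteArrayInputStream.
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            cm.writeDelimitedTo(buf);
            long delimitedStart = System.nanoTime();
            PutMessage parsedDelimited = PutMessage.parseDelimitedFrom(new ByteArrayInputStream(buf.toByteArray()));
            long delimitedEnd = System.nanoTime();
            System.out.println("Delimited parse time: " + (delimitedEnd - delimitedStart));
        }
    }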

The difference between Windows and Ubuntu in this particular case may stem from the IO implementation waiting for a certain time for more data before handing it over (in order to reduce overhead). Things I'd try are closing or flushing the stream on the server, or just sending some additional data there.
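
A minimal sketch of the flushing idea, applied to the server loop from the question (the class name and the BufferedOutputStream wrapper are my additions), so the length prefix and the message body are handed to the socket together:

    import java.io.BufferedOutputStream;
    import java.io.IOException;
    import java.net.ServerSocket;
    import java.net.Socket;
    import metadata.Metadata.PutMessage;

    public class FlushingServer {
        public static void main(String[] args) throws IOException {
            ServerSocket ss = new ServerSocket(8080);
            Socket socket = ss.accept();

            // Wrap the socket stream once, outside the loop, and flush after each
            // reply so the length prefix and the message body are pushed out together.
            BufferedOutputStream out = new BufferedOutputStream(socket.getOutputStream());
            while (true) {
                PutMessage sm = PutMessage.parseDelimitedFrom(socket.getInputStream());
                if (sm == null) break; // client closed the connection
                sm.writeDelimitedTo(out);
                out.flush();
            }
        }
    }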

Stefan Haustein

I found out that if, instead of calling writeDelimitedTo and parseDelimitedFrom, we write the length and the raw bytes ourselves, there is no delay:

Server:

public class CodedServer {
    public static void main(String args[]) throws IOException, InterruptedException {
        ServerSocket ss = new ServerSocket(8080);
        System.out.println("Listening for client8");
        Socket socket = ss.accept();
        System.out.println("Server Results: ");
        CodedOutputStream out = CodedOutputStream.newInstance(socket.getOutputStream());
        CodedInputStream in = CodedInputStream.newInstance(socket.getInputStream());
        while (true) {

            Long parseStart = System.nanoTime();
            int size = in.readInt32();
            byte[] result = in.readRawBytes(size);
            PutMessage cm = PutMessage.parseFrom(result);
            Long parseEnd = System.nanoTime();

            Long writeStart = System.nanoTime();
            out.writeInt32NoTag(cm.getSerializedSize());
            cm.writeTo(out);
            out.flush();
            Long writeEnd = System.nanoTime();

            System.out.println("Parse time: " + (parseEnd - parseStart));
            System.out.println("Write time: " + (writeEnd - writeStart));

        }

    }
}

Client:

public class CodedClient {
    public static void main (String args[]) throws IOException, InterruptedException{
        Socket socket = new Socket(args[0], 8080);
        int A = new Integer(args[1]);
        PutMessage cm = PutMessage.newBuilder().setDt(3434).setKey("Test Stringdfgdsfgsdfgsdfgsdfgsdfgsdfgsdfgsdfgsdfgsdfgsdfgsdfg").build();

        CodedOutputStream out = CodedOutputStream.newInstance(socket.getOutputStream());
        CodedInputStream in = CodedInputStream.newInstance(socket.getInputStream());

        System.out.println("Client Results9: ");
        for (int i=0;i < A; i++){
            Long writeStart = System.nanoTime();
            out.writeInt32NoTag(cm.getSerializedSize());
            cm.writeTo(out);
            out.flush();
            Long writeEnd = System.nanoTime();


            Long parseStart = System.nanoTime();
            int size = in.readInt32();
            byte[] result = in.readRawBytes(size);
            cm = PutMessage.parseFrom(result);
            Long parseEnd = System.nanoTime();

            System.out.println("Write time: " + (writeEnd - writeStart));
            System.out.println("Parse time: " + (parseEnd - parseStart));
        }
        System.out.println(cm.toString());
    }
}

I am happy to have found a solution, but I wonder why the implementation of these two standard functions in Google's protocol buffers causes such a significant delay?

spartanHp