I am measuring the latency of a TCP/IP connection on Windows over the loopback interface and getting about 4 ms from when a message is sent until a response is received.
For RPC purposes, there is a TCF layer on top of TCP/IP. The messages sent and received carry only a single character of payload in addition to the TCF framing.
The "server" which handles the commands are implemented in C++ using boost asio. The "client" sending commands is a Python script that uses the Python TCF reference implementation.
I have tried setting the TCP_NODELAY socket option to disable the Nagle algorithm and have experimented with various buffer sizes for the socket, but the round-trip time remains at about 4 ms. I was expecting it to be considerably lower.
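For reference, this is roughly how I am setting those options on the Boost.Asio side (a minimal sketch; the buffer sizes shown are just example values I experimented with):

```cpp
#include <boost/asio.hpp>

// Tune an already-connected/accepted socket for small-message latency.
void tune_socket(boost::asio::ip::tcp::socket& socket)
{
    // Disable the Nagle algorithm so small messages are sent immediately.
    socket.set_option(boost::asio::ip::tcp::no_delay(true));

    // Example buffer sizes; I tried several values here.
    socket.set_option(boost::asio::socket_base::send_buffer_size(8192));
    socket.set_option(boost::asio::socket_base::receive_buffer_size(8192));
}
```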
Profiling on the C++ side shows that it spends about 50% of its execution time waiting for commands, so the next step will be to try replacing the Python script with a C++ implementation. Still, it would be nice to know what one can expect for the round-trip time on the loopback interface.
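As a baseline, I am thinking of something like the following minimal Boost.Asio client that times a single-byte round trip over loopback, bypassing the TCF layer entirely (a sketch only; the port is arbitrary and it assumes a server that echoes the byte back):

```cpp
#include <boost/asio.hpp>
#include <chrono>
#include <iostream>

int main()
{
    boost::asio::io_context io;
    boost::asio::ip::tcp::socket socket(io);

    // Connect to a hypothetical echo server on loopback; port 9000 is arbitrary.
    socket.connect({boost::asio::ip::make_address("127.0.0.1"), 9000});
    socket.set_option(boost::asio::ip::tcp::no_delay(true));

    char payload = 'x';
    char reply = 0;

    // Time one send/receive of a single byte.
    auto start = std::chrono::steady_clock::now();
    boost::asio::write(socket, boost::asio::buffer(&payload, 1));
    boost::asio::read(socket, boost::asio::buffer(&reply, 1));
    auto stop = std::chrono::steady_clock::now();

    std::cout << "round trip: "
              << std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count()
              << " us\n";
}
```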
This SO question:
Linux Loopback performance with TCP_NODELAY enabled
is related, but did not quite answer my question.