I have two Python scripts: a TCP server that sends data (at a rate of 1/256 times a second) and a TCP client that receives it. In the client script, I print the length of the received data. The server sends the string "5.8", so each message has length 3.

When client and server are on the same machine: the length of the data received is always 3. When client and server are on different machines in the same local network: the length varies but is around 39 (13 times the length of the data sent).

Is there a possible explanation for this discrepancy?

I think the network adding this much latency is unlikely, because the command line "ping" prints at most 2 ms latency with the largest amount of data.

IMPORTANT: I'm using Python 2.7.

import socket

def server():
    host = 'localhost' # replace with IP address in case client is on another machine
    port = 5051

    s = socket.socket()
    s.bind((host, port))
    s.listen(1)
    client_socket, adress = s.accept()

    while True:
        client_socket.send('a'.encode())
    client_socket.close()

if __name__ == '__main__':
    server()

import socket, random, time

def client():
    host = 'localhost' # replace with IP address in case client is on another machine
    port = 5051

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s_err = s.connect_ex((host, port))
    print(s_err)

    while True:
        data = s.recv(2048)
        print(len(data)) # returns different values depending on client location
    s.close()

if __name__ == '__main__':
    client()
  • Your server's data is being delayed and buffered on the sending side of the connection by "Nagle's Algorithm" in TCP. See https://en.wikipedia.org/wiki/Nagle%27s_algorithm for details, and see https://stackoverflow.com/questions/31826762/python-socket-send-immediately for the way to suppress Nagle when sending in Python. Regardless of Nagle, and even when sending on a local connection, as @maxim-egorushkin says, it's always possible for TCP to receive data in chunks that are not the same size as the sender's writes. Your program must be prepared to deal with that situation. – ottomeister May 30 '19 at 02:40
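For reference, disabling Nagle's algorithm on a Python socket is a single `setsockopt` call. The following is a minimal sketch; `make_nodelay_socket` is a hypothetical helper name, not something from the question. Note that this only changes how the sender batches small writes and does not restore message boundaries on the receiving side.

```python
import socket

def make_nodelay_socket():
    """Create a TCP socket with Nagle's algorithm disabled (hypothetical helper)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY asks the TCP stack to send small writes right away
    # instead of holding them back to coalesce with later writes.
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return s
```

In the server from the question, the option would be set on `client_socket` after `accept()`, since that is the socket doing the sending.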

3 Answers

Is there a possible explanation for this discrepancy?

TCP doesn't have a concept of a message. Data sent using multiple send calls can be received with one recv call and vice versa.

TCP is a stream where you need to delimit the messages yourself, so that the reader can determine message boundaries. Most common ways:

  1. Prefix each message with its length, encoded in a fixed-size field.
  2. Read until a message delimiter is encountered, e.g. \n.
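Both methods can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper names (`send_msg`, `recv_exact`, `recv_msg`, `split_lines`) and the 4-byte big-endian length field are choices made for this example, not part of any standard protocol.

```python
import socket
import struct

def send_msg(sock, payload):
    """Method 1: send payload (bytes) prefixed with a 4-byte big-endian length."""
    sock.sendall(struct.pack('!I', len(payload)) + payload)

def recv_exact(sock, n):
    """Read exactly n bytes, looping because recv() may return fewer."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise EOFError('connection closed in the middle of a message')
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)

def recv_msg(sock):
    """Method 1: read the length field, then exactly that many payload bytes."""
    (length,) = struct.unpack('!I', recv_exact(sock, 4))
    return recv_exact(sock, length)

def split_lines(buffer):
    """Method 2: split off complete newline-terminated messages.

    Returns (messages, leftover); leftover is an incomplete message
    that should be kept and prepended to the next recv() result.
    """
    parts = buffer.split(b'\n')
    return parts[:-1], parts[-1]
```

With the length prefix the receiver always knows how many bytes to wait for; with the delimiter it has to keep a buffer and carry the leftover bytes over to the next `recv()` call.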
Maxim Egorushkin
  • What would such a delimitation look like? Are there examples? – Nico Autia Jun 11 '19 at 12:31
  • @NicoAutia Any protocol with variable size messages uses one of the above methods. For example, look into `Chat` example in https://www.boost.org/doc/libs/1_67_0/doc/html/boost_asio/examples/cpp11_examples.html – Maxim Egorushkin Jun 11 '19 at 12:38

I would assume that the TCP/IP stack of the server has simply concatenated small packets to get better network throughput. Sending only 3 bytes of payload per packet would carry a large overhead: TCP encapsulation plus IP encapsulation. Many TCP/IP stacks coalesce small writes this way.

Anyway, you should neither worry about that nor be surprised whether it happens or not: TCP is a stream protocol, and the only guarantee is that all the bytes sent from one side will arrive in the same order at the other side. Any equipment (sender, receiver, or anything else on the network) may concatenate or fragment packets.
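A small experiment makes this guarantee concrete. The sketch below uses `socket.socketpair()` as a local stand-in for the TCP connection (an assumption made for this example; it is available on Unix, and on Windows from Python 3.5), and `demo_stream` is a hypothetical name. The chunk sizes that come back are not guaranteed to be any particular values; only the byte order and the total are.

```python
import socket

def demo_stream():
    """Send 13 three-byte messages, then read everything back in recv() chunks."""
    sender, receiver = socket.socketpair()
    for _ in range(13):
        sender.sendall(b'5.8')
    sender.close()  # end of stream, so the read loop below terminates

    chunk_sizes = []
    received = b''
    while True:
        chunk = receiver.recv(2048)
        if not chunk:  # empty result means the sender closed the stream
            break
        chunk_sizes.append(len(chunk))
        received += chunk
    receiver.close()
    return chunk_sizes, received
```

Whatever `chunk_sizes` turns out to be on a given run, `received` is the 39 bytes that were sent, in order.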

Serge Ballesta

This is what worked for me: I changed the buffer size of recv() to the number of bytes I want to receive, so in this case

data = s.recv(3)

instead of

data = s.recv(2048)

On the local machine, each send apparently arrived as its own small chunk, whereas over the local network several sends were combined before the client read them. So I limited the amount of data read per call. Note that recv(3) returns at most 3 bytes and may still return fewer.

This can of course cause lag, since unread data piles up if the client reads more slowly than the server sends. So elsewhere I need to make sure that the remaining data is drained regularly with a larger buffer size, or close the connection when it is no longer needed. I'm not sure about the last point, since I'm still a beginner.
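One way to keep the large buffer and still get 3-byte messages is to accumulate whatever recv() returns and slice complete records off the front. This is a sketch under the assumption that every message is exactly 3 bytes long; `split_fixed` and `iter_records` are hypothetical names introduced for this example.

```python
import socket

def split_fixed(buffer, size):
    """Cut buffer into complete records of `size` bytes; return (records, leftover)."""
    usable = len(buffer) - len(buffer) % size
    records = [buffer[i:i + size] for i in range(0, usable, size)]
    return records, buffer[usable:]

def iter_records(sock, size, bufsize=2048):
    """Yield fixed-size records from sock until the peer closes the connection."""
    buffer = b''
    while True:
        data = sock.recv(bufsize)  # may hold zero, one, or many records
        if not data:
            break
        buffer += data
        records, buffer = split_fixed(buffer, size)
        for record in records:
            yield record
```

In the client this would replace the `while True` loop, e.g. `for data in iter_records(s, 3): print(len(data))`, which reads in large chunks but still hands out one message at a time.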